We prove that local random quantum circuits acting on n qubits composed of polynomially
many nearest neighbor two-qubit gates form an approximate unitary poly(n)-design.
Previously it was unknown whether random quantum circuits were a t-design for any t > 3.

The proof is based on an interplay of techniques from quantum many-body theory,
representation theory, and the theory of Markov chains. In particular we employ a result of
Nachtergaele for lower bounding the spectral gap of frustration-free quantum local
Hamiltonians; a quasi-orthogonality property of permutation matrices; a result of Oliveira
which extends to the unitary group the path-coupling method for bounding the mixing time of
random walks; and a result of Bourgain and Gamburd showing that dense subgroups of
the special unitary group, composed of elements with algebraic entries, are $\infty$-copy
tensor-product expanders.

We also consider pseudo-randomness properties of local random quantum circuits of 
small depth and prove they constitute a quantum poly(n)-copy tensor-product expander. 
The proof also rests on techniques from quantum many-body theory, in particular on the 
detectability lemma of Aharonov, Arad, Landau, and Vazirani. 

We give three applications of the results. First we show the following pseudo-randomness
property of generic quantum circuits: Almost every circuit U of size $n^k$ on
n qubits cannot be distinguished from a Haar uniform unitary by circuits of size $n^{(k+3)/6}$
that are given oracle access to U; this provides a data-hiding scheme against
computationally bounded adversaries. Second we reconsider a recent argument of Masanes, Roncaglia,
and Acin concerning local equilibration of time-evolving quantum systems, and strengthen 
the connection between fast equilibration of small subsystems and the circuit complexity 
of the unitary which diagonalizes the Hamiltonian. Third we show that in one dimension 
almost every parallel local circuit of linear depth generates topological order, matching an 
upper bound to the problem due to Bravyi, Hastings, and Verstraete. 



I. INTRODUCTION 



Random unitary matrices are an important resource in quantum information theory and
quantum computing. Examples of the use of random unitaries, drawn from the Haar measure on
the unitary group, include the encoding for almost every known protocol for sending
information down a quantum channel [1], approximate encryption of quantum information [10], quantum
data-hiding [2], information locking [2], and solving certain instances of the hidden subgroup



Electronic address: fgslbrandao@gmail.com
Electronic address: aram@cs.washington.edu
Electronic address: fizmh@ug.edu.pl






problem over non-abelian groups. Yet random unitary matrices are unreasonable from a
computational point of view: To implement a Haar-random unitary one needs an exponential number
of two-qubit gates and random bits. Thus it is interesting to explore constructions of
pseudo-random unitaries, which can be efficiently implemented and can replace random unitaries in
some respects.

An approximate unitary t-design is a distribution of unitaries which mimics properties of the
Haar measure for polynomials of degree up to t (in the entries of the unitaries) [4-7].
Approximate designs have a number of interesting applications in quantum information theory,
replacing the use of truly random unitaries (see e.g. [7, 8, 12, 17-20]). It has been a
conjecture in the theory of quantum pseudo-randomness that polynomial-size random quantum
circuits on n qubits form an approximate unitary poly(n)-design [9]. Analogously,
polynomial-size reversible circuits are known to form approximately poly(n)-wise independent
permutations [21] (see also [22]). However, up to now, the best result known for quantum
circuits was that polynomial-size random quantum circuits are approximate unitary 3-designs [17],
which improved on a series of papers establishing that random circuits are approximate
unitary 2-designs [?]. Moreover, efficient constructions of quantum t-designs, using a
polynomial number of quantum gates and random bits, were only known for $t = O(n/\log(n))$ [14].
In this paper we make progress on the problem of unitary t-designs. We prove that local random
quantum circuits acting on n qubits composed of polynomially many nearest neighbour two-qubit
gates form an approximate unitary poly(n)-design, settling the conjecture in the affirmative.

In the remainder of this section, we will give the definitions and notation used in this paper.
Then we will state the main result in Section II and outline a few applications in Section III. The
rest of the paper is devoted to the proof, with an overview in Section IV and the details in
Section V.



A. Approximate Unitary Designs and Quantum Tensor-Product Expanders 



We start with the definition of tensor-product expanders [23], which are objects similar to
approximate unitary designs, but with the approximation to the Haar measure quantified differently.
Let $\mu_{\rm Haar}$ be the Haar measure on U(N) (the group of $N \times N$ unitary matrices).

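Definition 1. Let $\nu$ be a distribution on U(N). Then $\nu$ is an $(N, \lambda, t)$ quantum t-copy
tensor-product expander (or TPE for short) if

\[
g(\nu, t) := \left\| \int_{U(N)} U^{\otimes t,t}\, \nu(dU) - \int_{U(N)} U^{\otimes t,t}\, \mu_{\rm Haar}(dU) \right\|_\infty \le \lambda, \tag{1}
\]

with $U^{\otimes t,t} := U^{\otimes t} \otimes (U^*)^{\otimes t}$. We say $\nu$ is an $(N, \lambda, \infty)$-TPE if it is an
$(N, \lambda, t)$-TPE for all t.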



This definition is meant to generalize the spectral characterization of expander graphs, and has
the similar advantage that $|\mathrm{supp}(\nu)|$ can be constant even for constant $\lambda$ and unbounded N, t
(cf. [23]). Another advantage is that the TPE condition can be naturally amplified. For a
distribution $\nu$, let $\nu^{*k}$ be the k-fold convolution of $\nu$, i.e.

\[
\nu^{*k} = \int \delta_{U_1 \cdots U_k}\, \nu(dU_1) \cdots \nu(dU_k). \tag{2}
\]

Then it follows immediately from (1) and the fact that $\int_{U(N)} U^{\otimes t,t}\, \mu_{\rm Haar}(dU)$ is a projector that

\[
g(\nu^{*k}, t) = g(\nu, t)^k. \tag{3}
\]
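The amplification property (3) can be checked numerically in the simplest case. The sketch below is an illustration under stated assumptions, not part of the proof: it takes t = 1 and N = 2, uses a small gate set that is closed under inverses (so that the moment operator is Hermitian), and uses the standard fact that for t = 1 the Haar moment operator is the projector onto the maximally entangled state. The helper names `haar_unitary` and `moment_operator` are hypothetical.

```python
import numpy as np

def haar_unitary(dim, rng):
    """Sample a Haar-random unitary via QR of a complex Gaussian matrix."""
    z = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases  # rescale columns so the distribution is Haar

def moment_operator(gates):
    """Uniform average of U (x) conj(U) over a finite gate set (the t = 1 case)."""
    return sum(np.kron(u, u.conj()) for u in gates) / len(gates)

rng = np.random.default_rng(0)
N = 2
base = [haar_unitary(N, rng) for _ in range(2)]
gates = base + [u.conj().T for u in base]  # closed under inverses (assumption)

# For t = 1 the Haar moment operator projects onto the maximally entangled state.
phi = np.eye(N).reshape(N * N) / np.sqrt(N)
P_haar = np.outer(phi, phi.conj())

M = moment_operator(gates)
k = 5
g1 = np.linalg.norm(M - P_haar, 2)                              # g(nu, 1)
gk = np.linalg.norm(np.linalg.matrix_power(M, k) - P_haar, 2)   # g(nu^{*k}, 1)
print(g1 ** k, gk)
```

The moment operator of the k-fold convolution is the k-th matrix power of `M`, which is why `matrix_power` appears; the two printed numbers should agree up to rounding.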






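Definition 1 can also be expressed in terms of quantum operations. Define $\mathrm{ad}_U[X] := U X U^\dagger$.
For a distribution $\nu$ on U(N) let

\[
\Delta_{\nu,t}(\rho) := \int_{U(N)} \mathrm{ad}_{U^{\otimes t}}[\rho]\, \nu(dU) = \int_{U(N)} U^{\otimes t} \rho\, (U^\dagger)^{\otimes t}\, \nu(dU). \tag{4}
\]

Define, for any $p \ge 1$, the superoperator norms

\[
\|T\|_{p \to p} := \sup_{X \ne 0} \frac{\|T(X)\|_p}{\|X\|_p}, \tag{5}
\]

where $\|X\|_p := (\mathrm{tr}|X|^p)^{1/p}$ are the Schatten norms. An alternate definition of the TPE condition is
then:

\[
g(\nu, t) = \left\| \Delta_{\nu,t} - \Delta_{\mu_{\rm Haar},t} \right\|_{2 \to 2}. \tag{6}
\]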

In many applications, however, it is often more natural to work with measures such as the trace
distance. For example, we would like to argue that sampling U from $\nu$ and using it t times results
in a state that is $\epsilon$-close to one that would be obtained by sampling U from the Haar measure. This
will lead us to the notion of an approximate unitary design (also called an $\epsilon$-approximate t-design
when we want to emphasize the parameters).

Previous research has used several definitions of $\epsilon$-approximate t-designs, such as replacing
the $\|\cdot\|_\infty$ and $\lambda$ in (1) with $\|\cdot\|_1$ and $\epsilon$, or replacing the $2 \to 2$ norm in (6) with the diamond norm
(defined below). See [24] for a comparison of these, and other, ways of defining approximate
unitary designs.

Here we propose a stronger definition of approximate designs, which was suggested to us by
Andreas Winter. First, if $\mathcal{N}_1, \mathcal{N}_2$ are superoperators, then we say that $\mathcal{N}_1 \preceq \mathcal{N}_2$ iff $\mathcal{N}_2 - \mathcal{N}_1$ is
completely positive, or equivalently if

\[
(\mathcal{N}_1 \otimes \mathrm{id})(\Phi_N) \preceq (\mathcal{N}_2 \otimes \mathrm{id})(\Phi_N), \tag{7}
\]

where N is the input dimension of $\mathcal{N}_{1,2}$, $\preceq$ here denotes the usual semidefinite ordering, and
$|\Phi_N\rangle = N^{-1/2} \sum_{i=1}^{N} |i, i\rangle$ is the standard maximally entangled state on $N \times N$ dimensions.

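Definition 2. Let $\nu$ be a distribution on U(N). Then $\nu$ is an $\epsilon$-approximate unitary t-design if

\[
(1 - \epsilon)\, \Delta_{\mu_{\rm Haar},t} \preceq \Delta_{\nu,t} \preceq (1 + \epsilon)\, \Delta_{\mu_{\rm Haar},t}, \tag{8}
\]

or equivalently

\[
(1 - \epsilon)(\Delta_{\mu_{\rm Haar},t} \otimes \mathrm{id})(\Phi_N^{\otimes t}) \preceq (\Delta_{\nu,t} \otimes \mathrm{id})(\Phi_N^{\otimes t}) \preceq (1 + \epsilon)(\Delta_{\mu_{\rm Haar},t} \otimes \mathrm{id})(\Phi_N^{\otimes t}). \tag{9}
\]

For brevity, let $G(\nu, t)$ denote the smallest $\epsilon$ for which (8) holds.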

The advantage of Definition 2 is that for any state on t systems that is acted upon by a
random $U^{\otimes t}$ and then measured, the probability of any measurement outcome will change by only a
small multiplicative factor whether U is drawn from $\nu$ or the Haar measure. To relate our design
definition to the distinguishability of quantum operations, we first define the diamond norm [25]
of a superoperator T as follows:

\[
\|T\|_\diamond := \sup_d \|T \otimes \mathrm{id}_d\|_{1 \to 1}. \tag{10}
\]

Lemma 3. If $\nu$ is an $\epsilon$-approximate unitary t-design, then $\|\Delta_{\nu,t} - \Delta_{\mu_{\rm Haar},t}\|_\diamond \le 2\epsilon$. Conversely, if
$\|\Delta_{\nu,t} - \Delta_{\mu_{\rm Haar},t}\|_\diamond \le \epsilon$ then $\nu$ is an $\epsilon N^{2t}$-design.



The proof is in Appendix A.



The reason we should expect Lemma 3 to be true is that all norms on finite-dimensional spaces
are equivalent, and every definition of an approximate design is based on some norm of
$\Delta_{\nu,t} - \Delta_{\mu_{\rm Haar},t}$. In practice, the norms we are interested in always differ by factors that are
polynomial in dimension, which here means $N^{O(t)}$. See Lemma 2.2.14 of [24] for many more examples
of this phenomenon.

To prove our main result about circuits being unitary designs, we will take the common path 
of first showing that they are TPEs and then converting this result into a statement about being 
designs. This conversion again loses a dimensional factor. 



Lemma 4. Let $\nu$ be a distribution on U(N). Then

\[
\frac{g(\nu, t)}{2 N^{t/2}} \le G(\nu, t) \le N^{2t}\, g(\nu, t).
\]



This lemma is proved in Appendix A.



II. MAIN RESULTS 
A. Haar Uniform Gates 

We consider the following two models of random quantum circuits, defined as random walks
on $U(d^n)$ for an integer d:

• Local random circuit: In each step of the walk an index i is chosen uniformly at random from
[n - 1] and a unitary $U_{i,i+1}$ drawn from the Haar measure on $U(d^2)$ is applied to the two
neighbouring qudits i and i + 1.

• Parallel local random circuit: In each step either the unitary $U_{1,2} \otimes U_{3,4} \otimes \cdots \otimes U_{n-1,n}$ or the
unitary $U_{2,3} \otimes \cdots \otimes U_{n-2,n-1}$ is applied (each with probability 1/2), with $U_{j,j+1}$ independent
unitaries drawn from the Haar measure on $U(d^2)$. (This assumes n is even.)



We note the local random circuit model was considered previously in Refs. [12] and [17], while
a related model of parallel local random circuits, using a different set of quantum gates, was
considered in Ref. [18].
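The two random walks above can be sketched numerically for very small systems. The following is a minimal illustration, assuming dense matrices (so only a handful of qudits is feasible) and 0-based qudit labels; the function names `local_step`, `parallel_step`, and `random_circuit` are hypothetical and do not come from the paper.

```python
import numpy as np

def haar_unitary(dim, rng):
    """Sample a Haar-random unitary via QR of a complex Gaussian matrix."""
    z = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases

def local_step(n, d, rng):
    """One step of the local random circuit: a Haar gate on a random neighbouring pair."""
    i = int(rng.integers(0, n - 1))  # uniform over the n - 1 pairs (i, i + 1)
    gate = haar_unitary(d * d, rng)
    return np.kron(np.kron(np.eye(d ** i), gate), np.eye(d ** (n - i - 2)))

def parallel_step(n, d, rng):
    """One step of the parallel model (n even): gates on all odd or all even bonds."""
    offset = int(rng.integers(0, 2))       # 0: pairs (0,1),(2,3),...; 1: pairs (1,2),(3,4),...
    num_gates = (n - 2 * offset) // 2
    layer = np.eye(d ** offset)            # identity on the first qudit if offset == 1
    for _ in range(num_gates):
        layer = np.kron(layer, haar_unitary(d * d, rng))
    return np.kron(layer, np.eye(d ** offset))  # identity on the last qudit if offset == 1

def random_circuit(step, n, d, k, rng):
    """Compose k independent steps into a single d^n x d^n unitary."""
    U = np.eye(d ** n, dtype=complex)
    for _ in range(k):
        U = step(n, d, rng) @ U
    return U
```

For example, `random_circuit(local_step, 4, 2, 10, np.random.default_rng(1))` returns a 16 x 16 unitary drawn from ten steps of the local model on four qubits.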

Denote the distribution over one step of a local random circuit by $\nu_{LR,d,n}$ and over one step
of a parallel local random circuit by $\nu_{PLR,d,n}$. The distributions over circuits of length k can be
written as $(\nu_{LR,d,n})^{*k}$ or $(\nu_{PLR,d,n})^{*k}$, respectively. The main result of this paper is the following:

Theorem 5.

1. $g(\nu_{LR,d,n}, t) \le 1 - \frac{1}{n t^4 \log(t)}$.
2. $g(\nu_{PLR,d,n}, t) \le 1 - \frac{1}{12 t^4 \log(t)}$.

A direct consequence of this theorem, Eq. (3), and Lemma 4 is the following corollary about forming
$\epsilon$-approximate t-designs.

Corollary 6.

1. Local random circuits of length $\log(t)\, t^4 n\, (2nt \log(d) + \log(1/\epsilon))$ form $\epsilon$-approximate t-designs.

2. Parallel local random circuits of length $12 \log(t)\, t^4 (2nt \log(d) + \log(1/\epsilon))$ form $\epsilon$-approximate
t-designs.
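The passage from a gap bound to a circuit length can be made explicit. The sketch below assumes only the amplification property (3) and the upper bound $G(\nu,t) \le N^{2t} g(\nu,t)$ of Lemma 4 with $N = d^n$: if $g(\nu,t) \le 1 - \delta$, then $G(\nu^{*k},t) \le d^{2nt}(1-\delta)^k \le d^{2nt} e^{-\delta k}$, which is at most $\epsilon$ once $k \ge (2nt\ln d + \ln(1/\epsilon))/\delta$. The function name `steps_for_design` and the sample value of the gap are hypothetical.

```python
import math

def steps_for_design(gap, n, d, t, eps):
    """Number of steps k that guarantees d^(2nt) * (1 - gap)^k <= eps.

    Uses (1 - gap)^k <= exp(-gap * k); `gap` plays the role of delta in
    g(nu, t) <= 1 - delta.
    """
    return math.ceil((2 * n * t * math.log(d) + math.log(1 / eps)) / gap)

# Example with an illustrative gap of 0.01 on n = 4 qubits, t = 2, eps = 1e-3.
k = steps_for_design(0.01, n=4, d=2, t=2, eps=1e-3)
print(k)
```

Substituting the inverse gap from Theorem 5 for `1 / gap` gives lengths of the form stated in Corollary 6.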



B. Other Universal Sets of Gates 



Consider a set of gates $G := \{g_i\}_{i=1}^{m}$ with each $g_i \in U(d^2)$. We say G is universal if the group
generated by it is dense in $U(d^2)$, i.e. for every $g \in U(d^2)$ and for every $\epsilon > 0$ we can find a
sequence $(i_1, \ldots, i_L) \in [m]^L$ such that $\|g - g_{i_1} \cdots g_{i_L}\| \le \epsilon$. We say that the set G contains inverses if
$g^{-1} \in G$ whenever $g \in G$.

We can now consider random walks associated to a universal set of gates $G = \{g_i\}_{i=1}^{m}$:

• G-local random circuit: In each step of the walk two indices i, k are chosen uniformly at
random from [n - 1] and [m], respectively, and the unitary $g_k$ is applied to the two neighboring
qudits i and i + 1.

• G-parallel local random circuit: In each step either the unitary $U_{1,2} \otimes U_{3,4} \otimes \cdots \otimes U_{n-1,n}$ or the
unitary $U_{2,3} \otimes \cdots \otimes U_{n-2,n-1}$ is applied (each with probability 1/2), with $U_{j,j+1}$ independent
unitaries drawn uniformly from G.

Corollary 6 only considered the case of a Haar uniform set of gates in $U(d^2)$. A natural question
is whether one can prove similar results for other universal sets of gates. It turns out that combining
Theorem 5 with the result of Ref. [26] one can indeed do so, at least for a large class of gate sets:

Corollary 7. Fix $d \ge 2$. Let $G = \{g_i\}_{i=1}^{m}$ be a universal set of gates containing inverses, with each
$g_i \in SU(d^2)$ composed of algebraic entries. Then there exists $C = C(G) > 0$ such that

1. G-local random circuits of length $C \log(1/\epsilon) \log(t)\, t^5 n^2$ form $\epsilon$-approximate t-designs.

2. G-parallel local random circuits of length $C \log(1/\epsilon) \log(t)\, t^5 n$ form $\epsilon$-approximate t-designs.

The main tool behind the proof of Corollary 7 is the beautiful result of Bourgain and Gamburd
[26] establishing that any finite universal set of gates in SU(N), containing inverses and with
elements composed of algebraic entries, is an infinite tensor-product expander with nonzero gap.
We note that the proof in Ref. [26] does not give any estimate of the dependency of the spectral
gap on N. That is the reason why Corollary 7 also does not specify the dependency of the size of
the circuit on the local dimension d. (And of course, this gap can be arbitrarily small, e.g. if the
gates in G are all very close to the identity.)

C. Optimality of Results

It is worth asking whether these results can be improved. In Theorem 5 we suspect that the
dependence on t could be improved, perhaps even to obtain a gap that is independent of t. In
other words, we cannot rule out the possibility that random quantum circuits form a
$(d^n, \lambda, \infty)$-TPE for $\lambda \le 1 - 1/\mathrm{poly}(n)$. Indeed, taking $\nu$ to be uniform over even a constant number of
Haar-random unitaries from U(N) yields an $(N, \lambda, t)$-TPE for (according to [23]) $\lambda$ constant and t as
large as $N^{1/6 - o(1)}$, and (according to [26]) $\lambda < 1$ and $t = \infty$ (but with uncontrolled N-dependence,
and over a measure not quite the same as Haar). On the other hand, we can easily see that the
n-dependence of part 1 of Theorem 5 cannot be improved, and the bound in part 2 is already
independent of n. Even when t = 1, we can consider the action of a random circuit on a state
whose first qudit is in a pure state and the remaining qudits are maximally mixed. Under one
step of local random circuits, this state will change by only O(1/n).
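The O(1/n) claim can be illustrated with a small exact computation. The sketch below assumes the standard fact that, for a single copy, averaging $U X U^\dagger$ over Haar-random U on a pair of qudits replaces that pair by the maximally mixed state; it then averages over the n - 1 possible positions of the gate. The helper names are hypothetical.

```python
import numpy as np

def twirl_pair(rho, i, n, d):
    """Haar average of U rho U^dagger over gates on qudits i, i+1 (0-based), t = 1."""
    left, right = d ** i, d ** (n - i - 2)
    r = rho.reshape(left, d * d, right, left, d * d, right)
    reduced = np.einsum('apbcpd->abcd', r)               # trace out the pair
    mixed = np.eye(d * d) / (d * d)
    out = np.einsum('abcd,pq->apbcqd', reduced, mixed)   # reinsert a maximally mixed pair
    return out.reshape(d ** n, d ** n)

def one_step_average(rho, n, d):
    """Expected state after one step of the local random circuit."""
    return sum(twirl_pair(rho, i, n, d) for i in range(n - 1)) / (n - 1)

def trace_distance(a, b):
    return 0.5 * np.abs(np.linalg.eigvalsh(a - b)).sum()

def pure_then_mixed(n, d):
    """First qudit in |0><0|, remaining qudits maximally mixed."""
    rho = np.zeros((d, d))
    rho[0, 0] = 1.0
    for _ in range(n - 1):
        rho = np.kron(rho, np.eye(d) / d)
    return rho

for n in (3, 4, 5):
    rho = pure_then_mixed(n, 2)
    print(n, trace_distance(one_step_average(rho, n, 2), rho))
```

Only the gate on the first pair changes this state, which is why the printed distances scale like (1 - 1/d)/(n - 1).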

What about designs? Here, we can prove that neither the t nor n dependence can be improved 
by more than polynomial factors. 

Proposition 8. Let $\nu$ be a distribution with support on circuits of size smaller than r. Suppose that $\nu$ is
an $\epsilon$-approximate t-design on n qudits with $\epsilon \le 1/4$ and $t \le d^{n/2}$. Then

\[
r \ge \frac{nt}{5 d^4 \ln(nt)}. \tag{12}
\]

We believe that the restriction on $\epsilon$ could be relaxed to $\epsilon < 1$, at the cost of some more algebra.
However, once $t \sim d^n$, our lower bound must stop improving, since $O(d^{2n})$ two-qudit gates suffice
to implement any unitary [27], and in particular to achieve the Haar measure.



D. Classical analogues 

Random classical reversible circuits of size $O(n^3 t^2 \log(n) \log(1/\epsilon))$ are known to generate
t-designs [28] (with the caveat that 2-bit reversible gates are not universal, so the base distribution
needs to be over random 3-bit gates). Other work [22, 29] implies that the number of random bits
in these constructions can be reduced to a nearly-optimal $O(nt + \log(1/\epsilon))$. Implicit in much of this
work (e.g. see [30]) was an application similar to our Application I (below): namely, producing
permutations that could not be easily distinguished from a uniformly random permutation.

Our techniques may be able to yield an alternate proof of [28], possibly with sharper
parameters. However, doing so would run into the difficulty that Lemma 16 appears not to hold in
the classical case. To explain this, we introduce some notation. Let $\Pi \vdash [2t]$ indicate that $\Pi$ is a
partition of [2t], let $(x, y) \in \Pi$ indicate that x, y are in the same block of $\Pi$, and let $E_\Pi \subseteq [N]^{2t}$ be
the set $\{(i_1, \ldots, i_{2t}) : (x, y) \in \Pi \Rightarrow i_x = i_y\}$. Then the invariant subspace of $\{U^{\otimes t,t} : U \in S(N)\}$ is
spanned by the states

\[
|E_\Pi\rangle := \frac{1}{\sqrt{|E_\Pi|}} \sum_{(i_1, \ldots, i_{2t}) \in E_\Pi} |i_1, \ldots, i_{2t}\rangle \qquad \forall\, \Pi \vdash [2t]. \tag{13}
\]

However, here the analogy breaks down, because (31) is known to fail for the states $\{|E_\Pi\rangle\}$ [31].

III. APPLICATIONS



Below we give three applications of Corollary 6, each of which might be of independent
interest.

Application I: Fooling small quantum circuits. A first application of Corollary 6 is related to
pseudo-randomness properties of efficiently generated states and unitaries. A folklore result in
quantum information theory says that the overwhelming majority of quantum states on n qubits
cannot be distinguished (with a non-negligible bias) from the maximally mixed state by any
measurement which can be implemented by subexponential-sized quantum circuits [32, 33]. Thus,
even though such states are very well distinguishable from the maximally mixed state (since they
are pure), this is only the case when using measurements that are unreasonable from a computational
point of view. A drawback of this result is that the states themselves require exponential-sized
quantum circuits even to be approximated (by applying the circuit to the $|0\rangle^{\otimes n}$ state). Indeed, given an
n-qubit state which can be prepared by a circuit of k gates, one can always distinguish it from the
maximally mixed state by a measurement implementable by k + O(n) gates: One simply applies
the inverse of the circuit which creates the state (which is also a circuit of k gates) and
measures whether one has the $|0\rangle^{\otimes n}$ state or not.
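The inverse-circuit distinguisher described above is easy to simulate on a few qubits. In the sketch below a single Haar-random unitary stands in for the k-gate preparation circuit (an assumption made only to keep the example short); the function name `acceptance_probability` is hypothetical.

```python
import numpy as np

def haar_unitary(dim, rng):
    """Sample a Haar-random unitary via QR of a complex Gaussian matrix."""
    z = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases

def acceptance_probability(rho, circuit):
    """Probability of outcome |0...0> after applying the inverse circuit to rho."""
    out = circuit.conj().T @ rho @ circuit
    return float(np.real(out[0, 0]))

n = 3
dim = 2 ** n
rng = np.random.default_rng(5)
circuit = haar_unitary(dim, rng)        # stand-in for the preparation circuit

psi = circuit[:, 0]                     # circuit applied to |0...0>
pure = np.outer(psi, psi.conj())
mixed = np.eye(dim) / dim               # maximally mixed state

p_pure = acceptance_probability(pure, circuit)
p_mixed = acceptance_probability(mixed, circuit)
print(p_pure, p_mixed)
```

The test accepts the prepared state with certainty and the maximally mixed state with probability $2^{-n}$, so the bias is close to one.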



An interesting question in this respect posed by Aaronson [34] is the following: Can such a
form of data hiding (of whether one has a particular pure state or the maximally mixed state)
against bounded-sized quantum circuits be realized efficiently? More concretely, can we find a
state which can be created by a circuit of size s, yet is indistinguishable from the maximally mixed
state by any measurement implementable by circuits of size r, for r not too much smaller than s?
Using Corollary 6 we can show that this is indeed the case. In fact, this is a generic property of
states that can be created by circuits of size s:



Corollary 9. Let $(\nu_{LR,d,n})^{*s}$ be the distribution on $U(d^n)$ induced by s steps of the local random quantum
circuit model. Then



max 

M G size(r) 



{0 n \rtMU\0 r 



tr{M) 



with 



t 



n 3 log(d) log(s) 



> S 



1/6 



2rd 4 



,.'560*y/ 4 

1 an* J 



(14) 



(15) 

and where the maximization is realized over all two-outcome POVMs $\{M, \mathrm{id} - M\}$ which can be
implemented by quantum circuits of size r.

In particular, for fixed d, all but a $2^{-\Omega(n)}$-fraction of states generated by circuits of size $n^k$ (for k > 4)
cannot be distinguished from the maximally mixed state with bias larger than $n^{-\Omega(1)}$ by any circuit of size
$n^{(k+3)/6}\, \mathrm{polylog}^{-1}(n)$.



A direct consequence of Corollary 9 is connected to the problem of quantum circuit minimization.
There we are given a quantum circuit consisting of s gates and would like to determine the
minimum number of gates which are needed to approximate the original circuit. We define

\[
C_\epsilon(U) := \min\{k : \text{there exists } V \text{ with } k \text{ gates s.t. } \|V - U\|_\infty \le \epsilon\}. \tag{16}
\]

Then we can give a lower bound on $C_\epsilon(U)$ for a generic circuit U using Corollary 9 as follows:

Corollary 10. All but a $2^{-\Omega(n)}$-fraction of quantum circuits U of size $n^k$ (with respect to the measure
$(\nu_{LR,d,n})^{*n^k}$ induced by $n^k$ steps of the local random circuit model) satisfy $C_\epsilon(U) \ge n^{(k+3)/6}\, \mathrm{polylog}^{-1}(n)$
with $\epsilon := 1 - n^{-\Omega(1)}$.



A final result in this direction is that given a circuit in which Haar random unitaries are used a
polynomial number of times, replacing them by random circuits only incurs a small error.

Corollary 11. Let $C_U$ be a quantum circuit of size r on $\le r$ qudits that makes use of a unitary oracle U on
$n \le r$ qudits. That is, each gate in $C_U$ can either apply an arbitrary two-qudit gate to any pair of qudits,
or can apply U to its first n qudits. Then

\[
\left\| \int_{U(d^n)} \mathrm{ad}_{C_U}\, (\nu_{LR,d,n})^{*s}(dU) - \int_{U(d^n)} \mathrm{ad}_{C_U}\, \mu_{\rm Haar}(dU) \right\|_\diamond \le \epsilon, \tag{17}
\]

for any $\epsilon > 0$ and $s \ge n r^4 \log(r)(6nr \log(d) + \log(1/\epsilon))$. In other words, random circuits cannot be
distinguished from Haar-random unitaries by significantly shorter circuits.



The proofs of Corollary 9 and Corollary 11 are in Appendix D.



Application II: Fast quantum equilibration. A second application of Corollary 6 is related to
dynamical equilibration of subsystems of a time-evolving quantum system. Understanding how a
quantum system equilibrates despite unitary global dynamics is a long-standing problem (see e.g.
[35]). Recently several new insights have been achieved using ideas from quantum information
theory [36-43]. Here we show how the results presented so far can be used to strengthen a recent
connection [41] between the time of equilibration of small subsystems of a closed quantum system
and the circuit complexity of the unitary which diagonalizes the Hamiltonian of the system.

Consider a quantum Hamiltonian on n qudits (a Hermitian operator on $(\mathbb{C}^d)^{\otimes n}$) which can be
written as

\[
H = U D U^\dagger, \tag{18}
\]

with $D = \mathrm{diag}(E_1, \ldots, E_{d^n})$ a diagonal matrix in the computational basis formed by the eigenvalues
of H and U a unitary matrix. We divide the system into two subsystems S and E, where S
should be seen as a small subsystem and E as a bath for S. We consider an arbitrary initial state
$\rho_{SE}(0)$ and its time-evolved version $\rho_{SE}(t) = e^{-iHt} \rho_{SE}(0) e^{iHt}$. We are interested in the question
of how quickly the subsystem state $\rho_S(t) = \mathrm{tr}_E(\rho_{SE}(t))$ reaches equilibrium (if it equilibrates in
the first place). The equilibrium state is denoted by $\omega_S$ and is given by the reduced state on S of
the time-averaged state

\[
\omega_{SE} := \lim_{T \to \infty} \frac{1}{T} \int_0^T \rho_{SE}(t)\, dt = \sum_k P_k\, \rho_{SE}(0)\, P_k, \tag{19}
\]

with $P_k$ the eigenprojectors of the Hamiltonian H.
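The identity in (19) between the time average and the dephased state can be checked numerically. The sketch below makes a simplifying assumption that is not in the text: the spectrum consists of the integers 0, 1, ..., dim - 1, so the evolution is periodic with period $2\pi$ and the infinite-time average equals the average over one period, which a uniform grid with more points than the largest energy difference reproduces exactly.

```python
import numpy as np

def haar_unitary(dim, rng):
    """Sample a Haar-random unitary via QR of a complex Gaussian matrix."""
    z = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases

rng = np.random.default_rng(7)
dim = 4
energies = np.arange(dim, dtype=float)   # non-degenerate integer spectrum (assumption)
U = haar_unitary(dim, rng)               # H = U D U^dagger with D = diag(energies)

psi = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
psi /= np.linalg.norm(psi)
rho0 = np.outer(psi, psi.conj())

def evolve(rho, time):
    """Apply e^{-iHt} rho e^{iHt} using the eigendecomposition of H."""
    Ut = U @ np.diag(np.exp(-1j * energies * time)) @ U.conj().T
    return Ut @ rho @ Ut.conj().T

# Average over one period on a uniform grid of M = 2 * dim points.
M = 2 * dim
time_avg = sum(evolve(rho0, 2 * np.pi * j / M) for j in range(M)) / M

# Dephasing in the energy eigenbasis: sum_k P_k rho P_k.
projectors = [np.outer(U[:, k], U[:, k].conj()) for k in range(dim)]
dephased = sum(P @ rho0 @ P for P in projectors)

print(np.max(np.abs(time_avg - dephased)))
```

For a generic spectrum the same comparison holds only in the limit of long averaging times, so a finite-time average would agree approximately rather than exactly.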

In Refs. [?] the squared 2-norm average distance between $\rho_S(t)$ and $\omega_S$ was computed for
a Hamiltonian with U chosen from the Haar measure on $U(d^n)$:



Lemma 12 (Theorem 3 of [42]). 



tr (( PS (t) - LOsf) PHaar(dU) 



d s d 2 



+ 



ES 



a ES 



+ 



d 2 

a ES 



d E 



(20)

where c is an absolute constant, $d_E$, $d_S$ are the dimensions of the heat bath and the system, respectively,
$d_{ES} = d_E d_S = d^n$ is the dimension of the total system,



d" 



C = J>^, 7, = ^™* (21) 

k=l k=l 

and x = dt> w ^ ^k = dim (P/-) being the dimension ofeigenspace P^- 

It follows from Lemma 12 that for a non-degenerate Hamiltonian the time average of the R.H.S.
of Eq. (20) - over times of order of the inverse of the average energy gap - will be small (see Refs.
[40-42]). Thus a Hamiltonian whose basis is chosen according to the Haar measure and whose
spectrum has on average large energy gaps (which is expected to be the case typically) will
equilibrate rapidly.

In Ref. [41], Masanes, Roncaglia and Acin noted that the average computed in Lemma 12 only
involves polynomials in the entries of U of degree 4. Therefore one could consider the average
over an $\epsilon$-approximate unitary 4-design instead of the Haar measure and obtain the same result,
up to an additive error of $\epsilon$. Also in Ref. [41] an interesting connection between fast equilibration and
the circuit complexity of U was put forward: It was argued that, assuming that random circuits
of length $O(n^3)$ form an approximate unitary 4-design, the Hamiltonians of most circuits
of such size enjoy fast equilibration of small subsystems. Conversely, a simple argument shows
that circuits with complexity less than linear cannot lead to quick equilibration. Therefore there
appears to exist a connection between fast local equilibration and the circuit complexity of the
unitary diagonalizing the Hamiltonian.

Corollary 6 allows us to strengthen this connection as follows:

Corollary 13. For every $\delta > 0$, all but a $\delta$-fraction of quantum circuits U of size $n^k$ (with respect to
the measure $(\nu_{LR,d,n})^{*n^k}$ induced by $n^k$ steps of the local random circuit model) are such that, with
$H = U D U^\dagger$,

\[
\mathrm{tr}\left((\rho_S(T) - \omega_S)^2\right) \le \frac{1}{\delta} \left( \int_{U(d^n)} \mathrm{tr}\left((\rho_S(T) - \omega_S)^2\right) \mu_{\rm Haar}(dU) + 2^{-\Omega(n)} \right), \tag{22}
\]

and

\[
C_\epsilon(U) \ge n^{(k+3)/6}\, \mathrm{polylog}^{-1}(n), \tag{23}
\]

with $\epsilon := 1 - n^{-\Omega(1)}$.



Using Corollary 7 we can also obtain an analogous statement for any universal set of gates
containing inverses with elements formed by algebraic entries.



In words, our results allow us to confirm the expectation of Ref. [41] that random circuits
form an approximate unitary 4-design and also show that most such circuits indeed have large
circuit complexity. The latter is useful information because one could worry that most unitaries of
size $n^k$ (according to some chosen distribution on the set of circuits) would have a much shorter
circuit decomposition, invalidating the connection of the time of equilibration with the circuit
complexity of the diagonalizing unitary of the Hamiltonian. The fact that random circuits are
not just approximate 4-designs, but even approximate poly(n)-designs is what allows us to prove
Corollary 10 and show that indeed there is no such considerably shorter decomposition in general.

Although Corollary 13 makes clearer the connection between fast subsystem equilibration and the
complexity of diagonalizing the Hamiltonian, it is still not the kind of statement one would
hope for. Indeed, to establish the connection in full one would like to show that for most circuits
for which $C_\epsilon(U)$ is large enough (say, in the range $n^{k_1} \le C_\epsilon(U) \le n^{k_2}$ for all sufficiently large
$k_1 \le k_2 = k_2(k_1)$), equilibration is fast. Here we can merely prove that most circuits U of
sufficiently large size are such that the corresponding Hamiltonian enjoys fast equilibration of small
subsystems and $C_\epsilon(U)$ is big.
Another version of the claim that we would like to establish concerns the incompressibility of
random circuits. A strong version of this conjecture would be that any $\epsilon$-covering of the set of
t-gate random circuits has cardinality $\ge (1/\epsilon)^{\Omega(t d^2)}$. See Proposition 8 and Lemma 25 for some
much weaker claims in this direction.

One difficulty in establishing such a conjecture is that the exact Hausdorff dimension of r-gate
random circuits will depend on the gauge freedom determined by their overlaps. For example,
an element of SU(4) has 15 real degrees of freedom, but three-qubit circuits of the form $U_{12} U_{23}$
have 15 + 15 - 3 degrees of freedom, corresponding to the fact that the transformation
$U_{12}, U_{23} \mapsto U_{12} V_2, V_2^\dagger U_{23}$ leaves $U_{12} U_{23}$ unchanged for any $V_2 \in SU(2)$.



Application III: Generation of Topological Order. Topological order is a concept from condensed
matter physics used to describe phases of matter that cannot be described by the Landau local
order paradigm [44]. Roughly speaking, topological order corresponds to patterns of long-range
entanglement in ground states of many-body Hamiltonians. The intrinsic stability of topologically
ordered systems against local perturbations also makes them attractive candidates for constructing
robust quantum memories or even topological quantum computation [45, 46].

In recent years it has emerged that it is fruitful to consider topological order as a property of
quantum states, instead of quantum Hamiltonians (see e.g. [47-49]). There are two approaches to
define topologically ordered states. The first is to say that a state has topological quantum order
(TQO) if it cannot be approximated by any state that can be generated by applying a local circuit
of small depth to a product state. Thus the state contains multiparticle entanglement that cannot
be created merely by local interactions. In more detail, an n-qubit state $|\psi\rangle$ defined on a lattice has
$(R, \epsilon)$-topological quantum order if for any parallel local circuit U of depth R,
$\| U|0\rangle^{\otimes n} - |\psi\rangle \| \ge \epsilon$.

The second approach is to say that a quantum state $|\psi_0\rangle$ defined on a lattice exhibits TQO if
there is another state $|\psi_1\rangle$ orthogonal to it such that for all local observables $O_{\rm loc}$,
$\langle\psi_0|O_{\rm loc}|\psi_0\rangle \approx \langle\psi_1|O_{\rm loc}|\psi_1\rangle$ and $\langle\psi_1|O_{\rm loc}|\psi_0\rangle \approx 0$ [47-49]. Thus one cannot distinguish the two states, or even
any superposition of them, by local measurements. Quantitatively we say two orthogonal states
$|\psi_0\rangle, |\psi_1\rangle$ defined on a finite dimensional lattice have $(l, \epsilon)$-TQO if for any observable $O_{\rm loc}$, with
$\|O_{\rm loc}\| \le 1$, supported on a set of diameter less than l, we have
$|\langle\psi_0|O_{\rm loc}|\psi_0\rangle - \langle\psi_1|O_{\rm loc}|\psi_1\rangle| \le 2\epsilon$ and $|\langle\psi_1|O_{\rm loc}|\psi_0\rangle| \le \epsilon$. As shown in Ref. [?], if a state is $(l, \epsilon)$ topologically ordered
according to the second definition, then it is also $(l/2, \epsilon)$ topologically ordered according to the
first definition. We remark in passing that topologically ordered states can also be understood
as code states of any quantum error correcting code with large distance, and so the terminology
"topological order" does not have to refer to any topological properties of the geometry of the
qubits.
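The remark about code states can be made concrete on a toy example. The sketch below, an illustration that is not taken from the paper, uses two codewords of the four-qubit error-detecting code (distance 2) and checks the two conditions of the second definition by computing trace norms of reduced operators, which upper bound the corresponding expectation values for any observable of norm at most 1. The lattice geometry is ignored and the function names are hypothetical.

```python
import numpy as np

def basis_state(bits):
    v = np.zeros(2 ** len(bits))
    v[int(bits, 2)] = 1.0
    return v

# Two orthogonal codewords of the four-qubit code.
psi0 = (basis_state("0000") + basis_state("1111")) / np.sqrt(2)
psi1 = (basis_state("0011") + basis_state("1100")) / np.sqrt(2)

def reduced_operator(ket, bra, keep, n):
    """Partial trace of |ket><bra| over all qubits not listed in `keep`."""
    traced = [q for q in range(n) if q not in keep]
    order = list(keep) + traced
    k = np.transpose(ket.reshape([2] * n), order).reshape(2 ** len(keep), -1)
    b = np.transpose(bra.conj().reshape([2] * n), order).reshape(2 ** len(keep), -1)
    return k @ b.T

def tqo_violations(keep, n=4):
    """Trace norms bounding the diagonal difference and the off-diagonal term."""
    diag = reduced_operator(psi0, psi0, keep, n) - reduced_operator(psi1, psi1, keep, n)
    off = reduced_operator(psi0, psi1, keep, n)
    return np.linalg.norm(diag, 'nuc'), np.linalg.norm(off, 'nuc')

single = [tqo_violations([q]) for q in range(4)]   # regions of one qubit
pair = tqo_violations([0, 1])                      # a region of two qubits
print(single, pair)
```

Single-qubit regions cannot tell the two codewords apart, while a two-qubit region already can, in line with the code having distance 2.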

In Ref. [47] it was shown that in any fixed dimension D a quantum evolution on n qubits, in
the form of a local Hamiltonian or a parallel local circuit, cannot generate topological quantum
order in time (or depth in the case of a quantum circuit) less than $O(n^{1/D})$. The next corollary
shows that in one dimension a generic evolution, chosen from the parallel local random circuit
model, saturates this bound. Thus almost every local dynamics in 1D generates topological order
at the fastest possible rate (according to the second and, hence, also first definition).

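Corollary 14. Let $|\psi_0\rangle$ and $|\psi_1\rangle$ be two arbitrary orthogonal states on n qudits, and let U be a random
unitary chosen from the measure $(\nu_{PLR,d,n})^{*300n}$, induced by 300n steps of the parallel local random circuit
model. Then, with probability larger than $1 - 2^{-n/8}$, for every region $X = \{l_0, \ldots, l_0 + l\}$ of size $l \le n/4$:

\[
\left\| \mathrm{tr}_{\backslash X}\left(U|\psi_0\rangle\langle\psi_0|U^\dagger\right) - \tau_X \right\|_1,\;
\left\| \mathrm{tr}_{\backslash X}\left(U|\psi_1\rangle\langle\psi_1|U^\dagger\right) - \tau_X \right\|_1 \le 2^{-n/8} \tag{24}
\]

and

\[
\left\| \mathrm{tr}_{\backslash X}\left(U|\psi_0\rangle\langle\psi_1|U^\dagger\right) \right\|_1 \le 2^{-n/8}, \tag{25}
\]

with $\tau_X$ the maximally mixed state in X and $\mathrm{tr}_{\backslash X}$ the partial trace with respect to all sites except the ones
in region X. Thus the states $U|\psi_0\rangle$ and $U|\psi_1\rangle$ exhibit $(n/4, 2^{-n/8})$-TQO.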

Corollary 14 also shows that one dimensional parallel random circuits scramble [12, 50, 51] -
making an initial localized bit of information inaccessible to an observer that only looks at
sublinear sized regions - in linear time, confirming the expectation of Refs. [12, 50].



IV. PROOF OVERVIEW OF THE MAIN RESULT 



A. Local random circuits 



The proof of part 1 of Theorem 5 consists of four steps, explained below.

1. Relating to Spectral Gap: In the first step, following the work of Brown and Viola [16] and
Ref. [17] (see also the earlier work [?]), we rephrase the TPE condition from Definition 1 in terms
of the spectral gap of a local quantum Hamiltonian. A local Hamiltonian on n D-dimensional
subsystems is a Hermitian matrix H, acting on $(\mathbb{C}^D)^{\otimes n}$, of the form $H = \sum_k H_k$, where each $H_k$
acts non-trivially only on a constant number of systems. The spectral gap of H, denoted by $\Delta(H)$,
is given by the absolute value of the difference of its two lowest distinct eigenvalues.

Consider the following local Hamiltonian acting on n subsystems, each of dimension $D := d^{2t}$:

\[
H_{n,t} := \sum_{i=1}^{n-1} h_{i,i+1}, \tag{26}
\]

with local terms $h_{i,i+1} := I - P_{i,i+1}$ acting on subsystems i, i + 1 and $P_{i,i+1}$ defined as

\[
P_{i,i+1} := \int_{U(d^2)} (U_{i,i+1})^{\otimes t,t}\, \mu_{\rm Haar}(dU), \tag{27}
\]

with I the identity operator and $U^{\otimes t,t} := U^{\otimes t} \otimes (U^*)^{\otimes t}$.
In Section V we prove:

Lemma 15.

$$g(\nu_{LR,d,n}, t) \le 1 - \frac{\Delta(H_{n,t})}{n}. \tag{28}$$

Lemma 15 thus shows that in order to bound the rate of convergence of the random walk
associated to the random quantum circuit (for its first t moments), it suffices to lower bound the
spectral gap of $H_{n,t}$.
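As a sanity check on these definitions, the following minimal sketch builds the Hamiltonian for the simplest case t = 1 and a small chain, and computes its spectral gap numerically. It relies on the standard fact that the first-moment Haar average $\int U \otimes U^* \mu_{Haar}(dU)$ is the projector onto the maximally entangled state, so that for t = 1 each $P_{i,i+1}$ factorizes into a product of single-site projectors. The function names and the dense-matrix construction are illustrative assumptions, not part of the paper's argument.

```python
import numpy as np

def chain_hamiltonian_t1(n: int, d: int = 2) -> np.ndarray:
    """Dense matrix of sum_i (I - P_{i,i+1}) for t = 1 on n sites of dimension d**2."""
    D = d * d
    # |Phi_d>: maximally entangled state on C^d (x) C^d, one copy per chain site
    phi = np.eye(d).reshape(D) / np.sqrt(d)
    proj = np.outer(phi, phi)                 # |Phi_d><Phi_d|
    pair = np.kron(proj, proj)                # P_{i,i+1} for t = 1
    H = np.zeros((D**n, D**n))
    for i in range(n - 1):
        left = np.eye(D**i)
        right = np.eye(D**(n - i - 2))
        H += np.kron(np.kron(left, np.eye(D * D) - pair), right)
    return H

def spectral_gap(H: np.ndarray, tol: float = 1e-9) -> float:
    """Difference between the two lowest distinct eigenvalues of a Hermitian matrix."""
    evals = np.linalg.eigvalsh(H)             # sorted in ascending order
    lowest = evals[0]
    excited = evals[evals > lowest + tol]
    return float(excited[0] - lowest)
```

For t = 1 the local terms commute, so the example is far simpler than the general case treated in the text; it only illustrates how the gap is read off from the spectrum.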

2. The Structure of $H_{n,t}$: It turns out that $H_{n,t}$ has a few special properties which make the
estimation of its spectral gap feasible.

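Lemma 16. For every n, t > 0 the following properties of $H_{n,t} = \sum_i (I - P_{i,i+1})$ hold:

1. The minimum eigenvalue of $H_{n,t}$ is zero and the zero eigenspace is given by

$$\mathcal{G}_{n,t} := \mathrm{span}\left\{ |\psi_{\pi,d}\rangle^{\otimes n},\ |\psi_{\pi,d}\rangle := (I \otimes V_d(\pi))|\Phi_d\rangle \,:\, \pi \in S_t \right\}, \tag{29}$$

with $|\Phi_d\rangle := d^{-t/2} \sum_{k=1}^{d^t} |k,k\rangle$ the maximally entangled state on $(\mathbb{C}^d)^{\otimes t} \otimes (\mathbb{C}^d)^{\otimes t}$, $S_t$ the symmetric
group of order t, and $V_d(\pi)$ the representation of the permutation $\pi \in S_t$ which acts on $(\mathbb{C}^d)^{\otimes t}$
as

$$V_d(\pi)\, |l_1\rangle \otimes \ldots \otimes |l_t\rangle = |l_{\pi^{-1}(1)}\rangle \otimes \ldots \otimes |l_{\pi^{-1}(t)}\rangle. \tag{30}$$

2. Let $G_{n,t}$ be the projector onto $\mathcal{G}_{n,t}$. If $t^2 < d^n$, then

$$\sum_{\pi \in S_t} |\langle \psi_{\sigma,d} | \psi_{\pi,d} \rangle|^n \le 1 + \frac{t^2}{d^n}, \tag{31}$$

and

$$\left\| \sum_{\pi \in S_t} (\psi_{\pi,d})^{\otimes n} - G_{n,t} \right\|_\infty \le \frac{t^2}{d^n}. \tag{32}$$

Here we use the convention that $\psi := |\psi\rangle\langle\psi|$.

In particular, the quasi-orthogonality property of the states given by Eqs. (31) and (32)
will be necessary to derive a good lower bound on the spectral gap of $H_{n,t}$.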
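The quasi-orthogonality sum can be checked by brute force for small parameters. The sketch below assumes the bound reads $\sum_\pi |\langle\psi_{\sigma,d}|\psi_{\pi,d}\rangle|^n \le 1 + t^2/d^n$ and uses the identity $\langle\psi_{\mathrm{id},d}|\psi_{\pi,d}\rangle = d^{-t}\,\mathrm{tr}\,V_d(\pi) = d^{c(\pi)-t}$, where $c(\pi)$ is the number of cycles of $\pi$; since $S_t$ is a group it suffices to take $\sigma$ to be the identity. All function names are hypothetical.

```python
from itertools import permutations
from math import prod

def cycle_count(perm) -> int:
    """Number of cycles of a permutation given as a tuple of images of 0..t-1."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles

def overlap_sum(d: int, n: int, t: int) -> float:
    """Sum over pi in S_t of |<psi_id|psi_pi>|^n, using <psi_id|psi_pi> = d^(cycles - t)."""
    return sum(float(d) ** (n * (cycle_count(p) - t)) for p in permutations(range(t)))

def closed_form(d: int, n: int, t: int) -> float:
    """(d^n + t - 1)...(d^n + 1) d^n / d^(tn), written as a product of (1 + j/d^n)."""
    return prod(1 + j / d**n for j in range(t))
```

For example, with d = 2, n = 4, t = 3 both functions give 306/256, which is below the assumed bound 1 + 9/16.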

3. Lower Bounding the Spectral Gap: With the properties given by Lemma 16 we are in position
to lower bound $\Delta(H_{n,t})$. To this aim we use a result of Nachtergaele [52], originally proposed to
lower bound the spectral gap of frustration-free local Hamiltonians with a ground space spanned
by matrix-product states [53, 54]. Using Nachtergaele's result in combination with Lemma 16 we
show in Section F 3 the following:

Lemma 17. For all integers n, t with $n \ge \lceil 10 \log(t) \rceil$,

$$\Delta(H_{n,t}) \ge \frac{\Delta\left(H_{\lceil 2 \log(d)^{-1} \log(t) \rceil, t}\right)}{8 \log(d)^{-1} \log(t)}. \tag{33}$$

Lemmas 15 and 17 directly show that, for every fixed t, local random quantum circuits of polynomial
size are ε-approximate unitary t-designs. Note, however, that they do not give
any information about the dependence of t on the size of the circuit.

4. Bounding Convergence with Path Coupling: The last step in the proof consists in lower
bounding $\Delta(H_{\lceil 2 \log(d)^{-1} \log(t) \rceil, t})$. We achieve this by using the connection of the random circuit
model problem with the spectral gap of $H_{n,t}$ in the reverse direction: We upper bound the
convergence time of the random walk on $\mathbb{U}(d^n)$ defined by the local random circuit in order to lower
bound the spectral gap $\Delta(H_{\lceil 2 \log(d)^{-1} \log(t) \rceil, t})$. The point is that now any bound on the
convergence time is useful. Actually in light of Lemma 17 it suffices to prove an exponentially small
bound on the convergence time in order to obtain part 1 of Theorem 5 and this is what we
accomplish.

We consider the convergence of the random walk in the Wasserstein distance between two
probability measures $\nu_1$ and $\nu_2$ on $\mathbb{U}(r)$, defined as

$$W(\nu_1, \nu_2) := \sup\left\{ \int_{\mathbb{U}(r)} f(U)\, \nu_1(dU) - \int_{\mathbb{U}(r)} f(U)\, \nu_2(dU) \,:\, f : \mathbb{U}(r) \to \mathbb{R} \text{ is 1-Lipschitz} \right\}, \tag{34}$$

where we say that f is 1-Lipschitz if for every two unitaries U, V, $|f(U) - f(V)| \le \|U - V\|_2$, with
$\|X\|_2 := \mathrm{tr}(X^\dagger X)^{1/2}$ the Frobenius norm. In Section G we prove

Lemma 18. For all integers k, n > 0,

$$W\left((\nu_{LR,n,d})^{*k}, \mu_{Haar}\right) \le \left(1 - \frac{1}{e^n (d^2+1)^{n-2}}\right)^{\frac{k}{n-1}} \sqrt{2}\, d^{n/2}. \tag{35}$$

The proof of Lemma 18 rests on Bubley and Dyer's path coupling method [55] for bounding the
mixing time of Markov chains. In particular, we use a version of path coupling for Markov chains
on the unitary group recently obtained by Oliveira [56] (see footnote 1).

Finally, it remains to show how Lemma 18 implies a lower bound on the spectral gap
$\Delta(H_{\lceil 2 \log(d)^{-1} \log(t) \rceil, t})$. This is the content of the following lemma, proved in Section H.

Lemma 19. For every t, d ≥ 1 and every measure $\nu$ on $\mathbb{U}(d)$,

$$g(\nu, t) \le 2t\, W(\nu, \mu_{Haar}). \tag{36}$$

Part 1 of Theorem 5 now follows from the previous lemmas.
Proof. (Part 1 of Theorem 5) Lemmas 18 and 19 give that for every m, t, k,

$$1 - \frac{\Delta(H_{m,t})}{m} \le \left(2t\sqrt{2}\, d^{m/2}\right)^{\frac{1}{k}} \left(1 - \frac{1}{e^m (d^2+1)^{m-2}}\right)^{\frac{1}{m-1}}. \tag{37}$$

Taking the $k \to \infty$ limit we find,

$$\Delta(H_{m,t}) \ge m^{-2} e^{-m} (d^2+1)^{-m}. \tag{38}$$
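The limit step can be checked numerically. The sketch below assumes that (37) reads $1 - \Delta(H_{m,t})/m \le (2t\sqrt{2}d^{m/2})^{1/k}(1 - x)^{1/(m-1)}$ with $x = e^{-m}(d^2+1)^{-(m-2)}$. As $k \to \infty$ the prefactor tends to 1, leaving $\Delta(H_{m,t}) \ge m\,(1 - (1-x)^{1/(m-1)})$, which the code compares with the right-hand side of (38). The function names are illustrative.

```python
import math

def gap_bound_from_limit(m: int, d: int) -> float:
    """k -> infinity limit of the assumed form of (37): m * (1 - (1 - x)^(1/(m-1))), for m >= 2."""
    x = math.exp(-m) * (d * d + 1) ** (-(m - 2))
    # expm1/log1p keep the tiny difference accurate when x is very small
    return -m * math.expm1(math.log1p(-x) / (m - 1))

def stated_bound(m: int, d: int) -> float:
    """Right-hand side of (38): m^-2 e^-m (d^2 + 1)^-m."""
    return m ** -2 * math.exp(-m) * (d * d + 1) ** (-m)
```

Analytically, $(1-x)^{a} \le 1 - ax$ for $0 < a \le 1$ gives $m(1 - (1-x)^{1/(m-1)}) \ge x$, and $x$ already exceeds the stated bound, so (38) is a deliberately loose simplification.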
Then by Lemma 17 and the previous equation, with $m = \lceil 2 \log(d)^{-1} \log(t) \rceil$, we get that for every



n 



e -21og(d)- 1 log(t) ( - (i 2 + 1 - ) -21og(d)- 1 log(i) 1 

A(Fn ' t} - 81og(d)-ilog(t) " Fbg^' (39) 

Our result now follows from Lemma 15. □



1 In fact the result of Ref. [56] is more general and extends the path coupling method to Markov chains on a Polish
length space.
B. Parallel local random circuits

To analyze parallel local random circuits and prove part 2 of Theorem 5 we use part 1 of
Theorem 5 and a recent tool for analysing quantum many-body Hamiltonians: the detectability
lemma of Aharonov et al. [57].

Define

$$M_{n,t} := \frac{1}{2} P_{1,2} \otimes P_{3,4} \otimes \ldots \otimes P_{n-1,n} + \frac{1}{2} P_{2,3} \otimes \ldots \otimes P_{n-2,n-1}, \tag{40}$$

and let $\lambda_2(M_{n,t})$ denote its second largest eigenvalue. In analogy with Lemma 15 it holds that

$$\left\| \int_{\mathbb{U}(d^n)} U^{\otimes t,t}\, \nu_{PLR,d,n}(dU) - \int_{\mathbb{U}(d^n)} U^{\otimes t,t}\, \mu_{Haar}(dU) \right\|_\infty = \lambda_2(M_{n,t}). \tag{41}$$

Let $P_{odd} := P_{1,2} \otimes P_{3,4} \otimes \ldots \otimes P_{n-1,n}$, $P_{even} := P_{2,3} \otimes \ldots \otimes P_{n-2,n-1}$ and $P_c$ be the projector onto
the intersection of $P_{odd}$ and $P_{even}$. Then

Lemma 20.

$$\lambda_2(M_{n,t}) \le \frac{1}{2} + \frac{1}{2} \|P_{odd} P_{even} - P_c\|_\infty. \tag{42}$$

Proof. We make use of the following result of [58] (Proposition 2.4): Given two projectors Q and
R, $\|Q + R\|_\infty \le 1 + \|QR\|_\infty$. Let $P_c$ be the projector onto the intersection of $P_{odd}$ and $P_{even}$. Applying
the previous inequality with $Q = P_{odd} - P_c$ and $R = P_{even} - P_c$,

$$\|P_{odd} + P_{even} - 2P_c\|_\infty \le 1 + \|P_{odd} P_{even} - P_c\|_\infty, \tag{43}$$

and so

$$\lambda_2(M_{n,t}) = \|M_{n,t} - P_c\|_\infty \le \frac{1}{2} + \frac{1}{2} \|P_{odd} P_{even} - P_c\|_\infty. \tag{44}$$

□
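The projector inequality used in this proof, $\|Q + R\|_\infty \le 1 + \|QR\|_\infty$, is easy to test on random instances. The sketch below draws random orthogonal projectors of fixed rank and checks the bound in the operator norm. The helper names, dimensions and tolerance are arbitrary illustrative choices.

```python
import numpy as np

def random_projector(dim: int, rank: int, rng: np.random.Generator) -> np.ndarray:
    """Orthogonal projector onto the span of `rank` random complex Gaussian vectors."""
    a = rng.standard_normal((dim, rank)) + 1j * rng.standard_normal((dim, rank))
    q, _ = np.linalg.qr(a)                    # columns: orthonormal basis of the subspace
    return q @ q.conj().T

def check_projector_bound(dim=8, rank_q=3, rank_r=4, trials=100, seed=0) -> bool:
    """Return True if ||Q + R|| <= 1 + ||Q R|| holds on every sampled pair."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        Q = random_projector(dim, rank_q, rng)
        R = random_projector(dim, rank_r, rng)
        lhs = np.linalg.norm(Q + R, 2)        # spectral norm
        rhs = 1 + np.linalg.norm(Q @ R, 2)
        if lhs > rhs + 1e-9:
            return False
    return True
```

A random test of this kind does not replace the cited proposition; it only guards against misreading the direction of the inequality.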



Now using the detectability lemma [57] we can show:

Lemma 21.

$$\lambda_2(M_{n,t}) \le \frac{1}{2} + \frac{1}{2} \left(1 + \frac{\Delta(H_{n,t})}{2}\right)^{-1/3}. \tag{45}$$

Proof. Since $H_{n,t}$ is a frustration-free Hamiltonian with projective local terms we can apply the
detectability lemma, which is the following bound

$$\|P_{odd} P_{even} - P_c\|_\infty \le \left(1 + \frac{\Delta(H_{n,t})}{2}\right)^{-1/3}. \tag{46}$$

The statement of the lemma thus follows from Lemma 20. □

The proof of part 2 of Theorem 5 now follows straightforwardly.
V. DISCUSSION AND OPEN QUESTIONS

Our work shows that random unitary circuits resemble Haar-uniform unitaries in the sense of
being approximate k-designs. Of course, this is not the only possible criterion for rapid mixing.
At least two other conditions that would be interesting to investigate are the rapid-scrambling
criterion [50, 51] and the log-Sobolev condition [59]. In general, it makes sense to investigate
these conditions in the context of an application, and here too more work could be done to clarify
questions such as which definition of ε-approximate t-designs is most natural.

For nearest-neighbor circuits in one dimension, our results are nearly optimal. In other
geometries, random circuits mix at least as well, but in this case, our results are likely to be far from
optimal. For interactions on general graphs, it is plausible that parallel random circuits mix in
time comparable to the diameter of the graph, which we establish only in the case of graphs of
linear diameter.

Another open question is whether our results can be strengthened to prove the
incompressibility of quantum circuits. See Application II in Section III for more discussion of this point.

Finally, in physical systems, it is natural to consider random time-independent Hamiltonians,
rather than random sequences of unitaries. Here the situation is qualitatively unlike that of
classical time-independent stochastic processes, since the phenomenon of Anderson localization
prevents mixing in many cases (see [60] for a review). It is an intriguing open question to understand
the cases in which rapid mixing nevertheless occurs, and in particular to give a physically
plausible derivation of the observed phenomenon of thermalization.
Appendix A: Facts about designs and tensor-product expanders

Definition 22. Define the symmetric subspace of $(\mathbb{C}^N)^{\otimes t}$ to be the set of vectors that are invariant
under $V_N(\pi)$ (defined in Eq. (30)) for all $\pi \in S_t$. Denote this subspace by $\vee^t \mathbb{C}^N$. Note that
$\dim \vee^t \mathbb{C}^N = \binom{N+t-1}{t}$.

Before proving Lemma [3] we relate the design condition of Definition [2] to an easier-to-prove 
condition. 
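Lemma 23. For $\nu$ a measure on $\mathbb{U}(N)$, define:

$$\rho_\nu := (\Delta_{\nu,t} \otimes \mathrm{id})(\Phi_N^{\otimes t}) \tag{A1a}$$
$$\rho_{Haar} := (\Delta_{\mu_{Haar},t} \otimes \mathrm{id})(\Phi_N^{\otimes t}) \tag{A1b}$$

Then

1. The support of $\rho_\nu$ and $\rho_{Haar}$ is contained in $S := \vee^t(\mathbb{C}^N \otimes \mathbb{C}^N)$.

2. The minimum eigenvalue of $\rho_{Haar}|_S$ is $\ge N^{-2t}$.

3. $$G(\nu, t) \le N^{2t} \|\rho_\nu - \rho_{Haar}\|_\infty \tag{A2}$$

Proof. 1. Each density matrix is a mixture of states of the form $((U \otimes I)\Phi_N)^{\otimes t}$, each of which
individually belongs to $\vee^t(\mathbb{C}^N \otimes \mathbb{C}^N)$.

2. We will use Schur duality (see [61, 62] for reviews). Schur duality implies that

$$|\Phi_N\rangle^{\otimes t} = \sum_{\lambda \in \mathrm{Par}(t,N)} \sqrt{\frac{\dim \mathcal{Q}_\lambda^N \dim \mathcal{P}_\lambda}{N^t}}\; |\lambda,\lambda\rangle \otimes |\Phi_{\mathcal{Q}_\lambda^N}\rangle \otimes |\Phi_{\mathcal{P}_\lambda}\rangle, \tag{A3}$$

where Par(t, N) denotes the partitions of t into ≤ N parts, $\mathcal{Q}_\lambda^N$ denotes the irrep of $\mathbb{U}(N)$
corresponding to partition $\lambda$, $\mathcal{P}_\lambda$ the irrep of $S_t$ corresponding to $\lambda$, and $|\Phi_{\mathcal{Q}_\lambda^N}\rangle$, $|\Phi_{\mathcal{P}_\lambda}\rangle$ refers
to maximally entangled states on pairs of these spaces. Then

$$\rho_{Haar} = \int_{\mathbb{U}(N)} \left(\mathrm{ad}_{U \otimes I}(\Phi_N)\right)^{\otimes t} \mu_{Haar}(dU) \tag{A4}$$
$$= \sum_{\lambda \in \mathrm{Par}(t,N)} \frac{\dim \mathcal{Q}_\lambda^N \dim \mathcal{P}_\lambda}{N^t}\; |\lambda,\lambda\rangle\langle\lambda,\lambda| \otimes \left(\frac{I_{\mathcal{Q}_\lambda^N}}{\dim \mathcal{Q}_\lambda^N}\right)^{\otimes 2} \otimes \Phi_{\mathcal{P}_\lambda}. \tag{A5}$$

Restricting to $\vee^t(\mathbb{C}^N \otimes \mathbb{C}^N)$, we find that the minimum eigenvalue is $\min_\lambda \frac{\dim \mathcal{P}_\lambda}{N^t \dim \mathcal{Q}_\lambda^N}$. This
minimum is achieved by the symmetric irrep $\lambda = (t)$, for which $\dim \mathcal{P}_\lambda = 1$ and
$\dim \mathcal{Q}_\lambda^N = \binom{N+t-1}{t} \le N^t$. Thus $\lambda_{min}(\rho_{Haar}|_S) \ge N^{-2t}$.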



3. This follows from parts 1 and 2 of this Lemma, along with (O. 



□ 



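Proof of Lemma 3. Let $\Theta := \Delta_{\nu,t} - \Delta_{\mu_{Haar},t}$. Then

$$\|\Theta\|_\diamond = \max_\psi \|(\Theta \otimes \mathrm{id})(\psi)\|_1 \tag{A6}$$
$$= \max_\psi \max_{0 \le M_-, M_+ \le I} \left[ \mathrm{tr}\, M_+ (\Theta \otimes \mathrm{id})(\psi) - \mathrm{tr}\, M_- (\Theta \otimes \mathrm{id})(\psi) \right] \tag{A7}$$
$$\le 2\epsilon, \tag{A8}$$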
where the last line used (©. 
Conversely, suppose that

$$\|\Delta_{\nu,t} - \Delta_{\mu_{Haar},t}\|_\diamond \le \epsilon. \tag{A9}$$

Define $\rho_\nu$, $\rho_{Haar}$ as in (A1). Then by (A9) we have

$$\epsilon \ge \|\rho_\nu - \rho_{Haar}\|_1 \ge \|\rho_\nu - \rho_{Haar}\|_\infty. \tag{A10}$$

The desired claim now follows from Lemma 23. □
Proof of Lemma® For the first inequality observe that 

$$g(\nu,t) \overset{(1)}{\le} N^{t/2} \|\Delta_{\nu,t} - \Delta_{\mu_{Haar},t}\|_\diamond \overset{(2)}{\le} 2N^{t/2} G(\nu,t). \tag{A11}$$

Here (1) is from part of Lemma 2.2.14 of [24] (see in particular Fig 2.1 of [24]), with the definition
OPERATOR-2-NORM from [24] corresponding to the TPE condition here, and (2) is from
Lemma 3.

For the second inequality, we use the fact that the 2→2 norm is stable under tensoring with
the identity map to obtain

$$\|(\Delta_{\nu,t} - \Delta_{\mu_{Haar},t}) \otimes \mathrm{id}_{N^t}\|_{2 \to 2} \le g(\nu, t). \tag{A12}$$

Thus, defining $\rho_\nu$, $\rho_{Haar}$ as in (A1), we have

$$g(\nu, t) \ge \|\rho_\nu - \rho_{Haar}\|_2 \ge \|\rho_\nu - \rho_{Haar}\|_\infty. \tag{A13}$$

Thus, Lemma 23 implies that $G(\nu, t) \le N^{2t} g(\nu, t)$. □

Appendix B: Proof of Corollary 7

We now show how we can get convergence rates for other universal sets of gates from our
analysis of the Haar random case:

Corollary (Restatement of Corollary 7). Fix d ≥ 2. Let $G = \{g_i\}_{i=1}^m$ be a universal set of gates
containing inverses, with each $g_i \in \mathrm{SU}(d^2)$ composed of algebraic entries. Then there exists C = C(G) > 0
such that

1. G-local random circuits of length $C \log(1/\epsilon) \log(t)\, t^5 n^2$ form ε-approximate t-designs.

2. G-parallel local random circuits of length $C \log(1/\epsilon) \log(t)\, t^5 n$ form ε-approximate t-designs.

Proof. Define the Hermitian matrix

$$P_{G,t} := \frac{1}{m} \sum_{i=1}^m g_i^{\otimes t,t}. \tag{B1}$$

It was proven in [26] that there is a constant $\lambda < 1$, independent of t, such that for all t,

$$\left\| P_{G,t} - \int U^{\otimes t,t} \mu_{Haar}(dU) \right\|_\infty \le \lambda. \tag{B2}$$

Moreover the eigenvalue-one subspace of $P_{G,t}$ is the space onto which $\int U^{\otimes t,t} \mu_{Haar}(dU)$ projects,
since G is universal. Define the local Hamiltonian in $(\mathbb{C}^d)^{\otimes n}$:

$$H_{G,n,t} := \sum_{i=1}^{n-1} (I - P_{G,t})_{i,i+1}. \tag{B3}$$

From Eq. (B2):

$$H_{G,n,t} \ge (1 - \lambda) H_{n,t}, \tag{B4}$$

with $H_{n,t}$ given by Eq. (26). Moreover $H_{G,n,t}$ has the same ground space as $H_{n,t}$. Thus

$$\Delta(H_{G,n,t}) \ge (1 - \lambda) \Delta(H_{n,t}). \tag{B5}$$

The corollary now follows from Lemmas 17, 18 and 19. □

Appendix C: Proof of Proposition 8

In this section, we state lower bounds on the size of t-designs that match our results in
Corollary 6 up to polynomial factors.

First we argue that if v is an approximate t-design, it must have large support. More precise 
lower bounds are known for exact i-designs lEHl and for approximate 2-designs ll&jl l, but for our 
purposes, it will be enough to determine the rate of scaling. 



Lemma 24. If $\nu$ is an ε-approximate t-design on $\mathbb{U}(N)$ then

$$|\mathrm{supp}(\nu)| \ge (1 - \epsilon) \binom{N+t-1}{t}^2. \tag{C1}$$

Proof. Let $S = \vee^t \mathbb{C}^N$ be the symmetric subspace of $(\mathbb{C}^N)^{\otimes t}$ (c.f. Definition 22). Define $|\varphi\rangle$ to be the
maximally entangled state on $S \otimes S$. Since S is an irrep of $\mathbb{U}(N)$ under the action $U \mapsto U^{\otimes t}$, it
follows that $(\Delta_{\mu_{Haar},t} \otimes \mathrm{id})(\varphi)$ is the maximally mixed state on $S \otimes S$. This has rank $\binom{N+t-1}{t}^2$. Thus,
to approximate this state to within trace distance ε requires a state of rank at least $(1 - \epsilon)\binom{N+t-1}{t}^2$.
Finally $\mathrm{rank}((\Delta_{\nu,t} \otimes \mathrm{id})(\varphi)) \le |\mathrm{supp}(\nu)|$. □

To relate the cardinality of a design with the number of gates in a quantum circuit, we need to
discretize the set of all quantum circuits. We say that a set X is an ε-covering of Y if for all $y \in Y$,
there exists an $x \in X$ with $d(x,y) \le \epsilon$ for some distance measure d.

Lemma 25. There exists an ε-covering in diamond norm of size $\le \binom{n}{2}^r \left(\frac{10r}{\epsilon}\right)^{r d^4}$ for the set of circuits on
n qudits comprised of ≤ r two-qudit gates.

Before proving the lemma, we note the following useful bound from part 6 of Lemma 12 of
[64] that applies to any unitaries U, V:

$$\|\mathrm{ad}_U - \mathrm{ad}_V\|_\diamond \le 2\|U - V\|_\infty. \tag{C2}$$

Proof. To describe an ε-covering for circuits, it suffices to specify the location of each gate and to
approximate each gate to accuracy ε/r. The former has $\binom{n}{2}^r$ choices; the latter requires r copies of
an ε/r-covering for $\mathbb{U}(d^2)$. Finally, standard arguments [65] show that such nets can be constructed
with size $\le (5r/\epsilon)^{d^4}$ for the operator norm. We convert operator norm to diamond norm using
(C2). □

Combining these results we can prove that t-designs require large circuits. 



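Proposition (Restatement of Proposition 8). Let $\nu$ be a distribution with support on circuits of size
smaller than r. Suppose that $\nu$ is an ε-approximate t-design on n qudits with ε ≤ 1/4 and $t \le d^{n/2}$. Then

$$r \ge \frac{nt}{5 d^4 \log(nt)}. \tag{C3}$$

Proof. Let the distribution $\nu$ be an ε-approximate unitary t-design with all elements composed of
r two-qudit gates, possibly including the identity. From Lemma 25, construct a diamond-norm
δ-covering for the set of circuits on n qudits, and denote it by $C_\delta$. Consider a new distribution
$\tilde\nu(dU)$ in which each unitary U is replaced by its closest unitary $\tilde U \in C_\delta$. We claim that $\{\tilde\nu(dU), \tilde U\}$
is a (ε + 2tδ)-approximate unitary t-design. Indeed

$$\|\Delta_{\tilde\nu,t} - \Delta_{\mu_{Haar},t}\|_\diamond \le \|\Delta_{\nu,t} - \Delta_{\mu_{Haar},t}\|_\diamond + \|\Delta_{\tilde\nu,t} - \Delta_{\nu,t}\|_\diamond \tag{C4a}$$
$$\le \epsilon + \max_{U \in \mathrm{supp}(\nu)} \min_{\tilde U \in C_\delta} \|\mathrm{ad}_{U^{\otimes t}} - \mathrm{ad}_{\tilde U^{\otimes t}}\|_\diamond \tag{C4b}$$
$$\le \epsilon + 2 \max_{U \in \mathrm{supp}(\nu)} \min_{\tilde U \in C_\delta} \|U^{\otimes t} - \tilde U^{\otimes t}\|_\infty \quad \text{by (C2)} \tag{C4c}$$
$$\le \epsilon + 2t \max_{U \in \mathrm{supp}(\nu)} \min_{\tilde U \in C_\delta} \|U - \tilde U\|_\infty \quad \text{hybrid inequality} \tag{C4d}$$
$$\le \epsilon + 2t\delta. \tag{C4e}$$

See Fact 2.0.1 of [66] for a statement and proof of the hybrid inequality.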
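The hybrid inequality used in this chain, $\|U^{\otimes t} - \tilde U^{\otimes t}\|_\infty \le t\,\|U - \tilde U\|_\infty$, can be checked numerically on small instances. The sketch below samples two Haar-random unitaries with the standard QR-based construction and compares both sides in the operator norm. The helper names and default parameters are illustrative.

```python
import numpy as np
from functools import reduce

def haar_unitary(dim: int, rng: np.random.Generator) -> np.ndarray:
    """Haar-random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = (rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases                          # rescale column j by the phase of r[j, j]

def tensor_power(u: np.ndarray, t: int) -> np.ndarray:
    """t-fold Kronecker power of u."""
    return reduce(np.kron, [u] * t)

def hybrid_gap(dim: int = 2, t: int = 3, seed: int = 1):
    """Return (||U^{(x)t} - V^{(x)t}||, t * ||U - V||) in the operator norm."""
    rng = np.random.default_rng(seed)
    u, v = haar_unitary(dim, rng), haar_unitary(dim, rng)
    lhs = np.linalg.norm(tensor_power(u, t) - tensor_power(v, t), 2)
    rhs = t * np.linalg.norm(u - v, 2)
    return lhs, rhs
```

The bound follows from a telescoping sum in which one tensor factor is swapped at a time, which is why the factor t appears.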

Choosing δ = ε/2t we get that the distribution $\tilde\nu$ is a 2ε-approximate t-design. Now we invoke
Lemmas 24 and 25 to bound

(i-2 £ )r + ;-M<isup P (,)i<(:y(^y d4 .. (C5) 



t J ~ 1 rr v " - \2J V e 
After some algebra, we obtain the desired bound on r. which implies that 



r " 2lo^- (C6) 

□ 



Appendix D: Proof of Corollaries l9l and ITU 

In the proof of Corollary [9] we make use of the following lemma due to Low, which gives a 
measure concentration result for t-designs. Our definition of approximate t-designs differs from 
his by a normalizing factor, and we have adjusted the statement of the result accordingly. 
Lemma 26 (Low, Theorem 1.2 of [20]). Let $f : \mathbb{U}(D) \to \mathbb{R}$ be a polynomial of degree K. Let
$f(U) = \sum_i \alpha_i M_i(U)$ where $M_i(U)$ are monomials and let $\alpha(f) = \sum_i |\alpha_i|$. Suppose that f has
probability concentration

$$\Pr_{U \sim \mu_{Haar}} \left( |f(U) - \mu| \ge \delta \right) \le C e^{-a\delta^2}, \tag{D1}$$

and let $\nu$ be an ε-approximate unitary t-design. Then for any integer m with 2mK ≤ t,

$$\Pr_{U \sim \nu} \left( |f(U) - \mu| \ge \delta \right) \le \frac{1}{\delta^{2m}} \left( C \left(\frac{m}{a}\right)^m + \epsilon\, (\alpha + |\mu|)^{2m} \right). \tag{D2}$$

Let us turn to the proof of Corollary 9.



Corollary (Restatement of Corollary 9). Let $(\nu_{LR,d,n})^{*s}$ be the distribution on $\mathbb{U}(d^n)$ induced by s steps
of the local random quantum circuit model. Then

rW )-^|>*)<(9 M4 .3(^)' /4 , (D3, 



Pr I max 

U~(v LR , d , n )" \M&size(r) 



with 

1/6 



n 3 log(d) log(s) 



(D4) 



and where the maximization is realized over all two-outcome POVMs {M, id − M} which can be
implemented by quantum circuits of size r.

In particular, for fixed d, all but a 2~^( n ) -fraction of states generated by circuits of size n k (for k > 4) 
cannot be distinguished from the maximally mixed state with bias larger than n~ n ^ by any circuit of size 
n~ polylog _1 (n). 

Proof. Consider a fixed POVM element 0 ≤ M ≤ I. Let us apply Lemma 26 with $f_M(U) := \langle 0^n | U^\dagger M U | 0^n \rangle$.
We have $D = d^n$, K = 2 and $\mu = \mathrm{tr}(M)/d^n \le 1$. We can upper bound α(f) by
$\sum_{i,j} |M_{i,j}| \le d^{2n}$. Moreover, by Levy's lemma [67],

jt Pr (\f M (U) -fi\>S)< 2e-^r. (D5) 



Choose 



and e = jy^r-Tvn^ ■ ( D6 ) 



n 3 log(d) log(s) J V ( d2n + l Y d 

From Corollary [6] we get that local random quantum circuits of size s form an e-approximate 
unitary t-design. Then using Lemma l26l with m = t/4 

^L„ (l/ " (t/) " "I £ m £ WW ( 2 (™f + e(d2 " + 1),/2 ) 

-(H)" 4 



Next, we let $S_r$ be a δ/2-covering of the set of circuits of size r (see Lemma 25 for a definition).
By Lemma 25 and using n ≤ r, we can assume

rdi /r \2rd 4 

(D8) 



The quantity we are interested in upper bounding is

I max |/ M (C7) - M | > *) < Pj (max |/ M ([7) - M | > 8/2) (D9) 



Using a union bound, this latter probability is 



(D10) 



Substituting our choice of t, we see that this probability is negligible when δ and d are constant
and $r \log(r) \ll s^{1/6} n^{-1/2}$. □

Finally we prove 

Lemma ITU (restatement). Let $C_U$ be a quantum circuit of size r on ≤ r qudits that makes use of a unitary
oracle U on n ≤ r qudits. That is, each gate in $C_U$ can either apply an arbitrary two-qudit gate to any pair
of qudits, or can apply U to its first n qudits. Then



') 



/^■Haar 

(dU) 



< e, 



(Dll) 



for any ε > 0 and $s \ge n r^4 \log(r)\,(6nr \log(d) + \log(1/\epsilon))$. In other words, random circuits cannot be
distinguished from Haar-random unitaries by significantly shorter circuits.



Proof. Let -I < M < I and \if>) G (C d ® C d )® n be such that 



X 



ad Cu ^LR,d,n( dC/ ) - / ad Cu ^Haar(dC/) 



(D12) 



= tr [Ml / (C l7 ®/)^)(Vl(C C /®/) t z,r R , d , n (dC/)- / (C u ®lM(?P\(C u ®I? maax (dU) 
\ \Jv(d n ) Jv(d n ) / 

Using repeatedly that tr((^4i ® ... <g) vlfc)Vi r .. ) fc) = \x(A\A2...Ak), for Vi ... ^ a representation in 
(C d ) 81 of a cycle, we can write 



f 17®*-* ® 7® 2W ^ R d n (dC7) - / [7®*'* ® /® 2 ™/i H aar(dC/) 



U(d n ) 



(D13) 



with t < r and 7 := L\...L V , where each is given by a tensor product of unitary operators and 
Thus ||7||i < d^HLlloo < d 4rn and so 



X < ||7||i 



: d 



Am 



[ [/»*'* ® /^ 2 "Xr d „(dtO " / U m > 1 ® 7® 2 ™/. H aar(d[/) 
/ ^'^W(d£/) " / f/ 0t 'VHaa r (dC/) 



l ) JU(d") 

The statement follows from part 1 of Theorem 5.



(D14) 

□ 
Appendix E: Proof of Corollary 14

Corollary 14 (restatement). Let $|\psi_0\rangle$ and $|\psi_1\rangle$ be two arbitrary orthogonal states on n qudits, and let U
be a random unitary chosen from the measure $(\nu_{PLR,d,n})^{*300n}$, induced by 300n steps of the parallel local
random circuit model. Then, with probability larger than $1 - 2^{-n/8}$, for every region $X = \{l_0, \ldots, l_0 + l\}$
of size $l \le n/4$:

$$\left\| \mathrm{tr}_{\backslash X}\left(U|\psi_0\rangle\langle\psi_0|U^\dagger\right) - \tau_X \right\|_1,\ \left\| \mathrm{tr}_{\backslash X}\left(U|\psi_1\rangle\langle\psi_1|U^\dagger\right) - \tau_X \right\|_1 \le 2^{-n/8} \tag{E1}$$

and

$$\left\| \mathrm{tr}_{\backslash X}\left(U|\psi_0\rangle\langle\psi_1|U^\dagger\right) \right\|_1 \le 2^{-n/8}, \tag{E2}$$

with $\tau_X$ the maximally mixed state in X and $\mathrm{tr}_{\backslash X}$ the partial trace with respect to all sites except the ones
in region X. Thus the states $U|\psi_0\rangle$ and $U|\psi_1\rangle$ exhibit $(n/4, 2^{-n/8})$-TQO.



Proof. A standard estimate gives that for a Haar random unitary U : 



E 



TX 



< 2~( n_/ ). 



From part 2 of Theorem 5 we find that



(E3) 



E 



'-'PLR. d,n 



tr\x [U\*l> )(ik 



TX 



< 2~ {n ~ l) + 2" 



(E4) 



(E1) then follows from the relation $\|Z\|_1 \le \sqrt{D}\, \|Z\|_2$, valid for any D × D matrix Z.
The proof of (E2) is completely analogous; we just have to note that



E 



< 4 • 2 



-(n-l) 



(E5) 
□ 



Appendix F: Proof of Lemmas for Theorem 5

1. Proof of Lemma 15

We start by proving Lemma 15, which is restated below for the convenience of the reader.

Lemma 15 (restatement).

$$g(\nu_{LR,d,n}, t) \le 1 - \frac{\Delta(H_{n,t})}{n}. \tag{F1}$$

Lemma III.1 For every k > 0,

$$g((\mu_{n,d})^{*k}, t) = g(\mu_{n,d}, t)^k = \left(1 - \frac{\Delta(H_{n,t})}{n}\right)^k, \tag{F2}$$

with $(\mu_{n,d})^{*k}$ the k-fold convolution of the measure $\mu_{n,d}$.

Proof. Let us start by proving the first equality in Eq. (F2). Following jlflll^) , we have



with $\lambda_2(X)$ the second largest eigenvalue of X. We now note that

$$\int U^{\otimes t,t}\, (\mu_{n,d})^{*k}(dU) = \left( \frac{1}{n} \sum_i P_{i,i+1} \right)^k,$$

and so indeed $g((\mu_{n,d})^{*k}, t) = g(\mu_{n,d}, t)^k$.

The second equality in Eq. (F2), in turn, can be obtained as follows:



= A 2 ( 



I u^fin^du)) = a 2 f -y;p M+ i 1 



n 



(F3) 



(F4) 



(F5) 
□ 



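2. Properties of $H_{n,t}$

We now prove Lemma 16.

Lemma 16 (restatement). For every n, t > 0 the following properties of $H_{n,t} = \sum_i (I - P_{i,i+1})$ hold:

1. The minimum eigenvalue of $H_{n,t}$ is zero and the zero eigenspace is given by

$$\mathcal{G}_{n,t} := \mathrm{span}\left\{ |\psi_{\pi,d}\rangle^{\otimes n},\ |\psi_{\pi,d}\rangle := (I \otimes V_d(\pi))|\Phi_d\rangle \,:\, \pi \in S_t \right\}, \tag{F6}$$

with $|\Phi_d\rangle := d^{-t/2} \sum_{k=1}^{d^t} |k,k\rangle$ the maximally entangled state on $(\mathbb{C}^d)^{\otimes t} \otimes (\mathbb{C}^d)^{\otimes t}$, $S_t$ the symmetric
group of order t, and $V_d(\pi)$ the representation of the permutation $\pi \in S_t$ which acts on $(\mathbb{C}^d)^{\otimes t}$
as

$$V_d(\pi)\, |l_1\rangle \otimes \ldots \otimes |l_t\rangle = |l_{\pi^{-1}(1)}\rangle \otimes \ldots \otimes |l_{\pi^{-1}(t)}\rangle. \tag{F7}$$

2. Let $G_{n,t}$ be the projector onto $\mathcal{G}_{n,t}$. If $t^2 < d^n$, then

$$\sum_{\pi \in S_t} |\langle \psi_{\sigma,d} | \psi_{\pi,d} \rangle|^n \le 1 + \frac{t^2}{d^n}, \tag{F8}$$

and

$$\left\| \sum_{\pi \in S_t} (\psi_{\pi,d})^{\otimes n} - G_{n,t} \right\|_\infty \le \frac{t^2}{d^n}. \tag{F9}$$

Here we use the convention that $\psi := |\psi\rangle\langle\psi|$.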
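Proof. Item 1. Since each $P_{i,i+1} \le I$, we have that the smallest eigenvalue of $H_{n,t}$ is ≥ 0. Let us now
determine the ground space.

$$H_{n,t}|\varphi\rangle = 0 \;\Leftrightarrow\; \frac{1}{n} \sum_{i=1}^{n} P_{i,i+1} |\varphi\rangle = |\varphi\rangle \tag{F10}$$
$$\Leftrightarrow\; \forall i \in [n],\ \forall U \in \mathbb{U}(d^2),\ (I^{\otimes i-1} \otimes U \otimes I^{\otimes n-i-1})^{\otimes t,t} |\varphi\rangle = |\varphi\rangle \tag{F11}$$
$$\Leftrightarrow\; \forall U \in \mathbb{U}(d^n),\ U^{\otimes t,t}|\varphi\rangle = |\varphi\rangle \tag{F12}$$

Here Eq. (F12) is because nearest-neighbor unitaries generate the set of all unitaries [68]. To justify
(F11), observe that

$$\mathrm{Re}\, \langle\varphi| \mathbb{E}_{i \in [n]} P_{i,i+1} |\varphi\rangle = \mathrm{Re}\, \mathbb{E}_{i \in [n]} \mathbb{E}_{U_{i,i+1}} \langle\varphi| U_{i,i+1}^{\otimes t,t} |\varphi\rangle \le 1$$

with equality if and only if $U_{i,i+1}^{\otimes t,t}|\varphi\rangle = |\varphi\rangle$ for all but a measure-zero subset of the $(i, U_{i,i+1})$ pairs.
And by continuity we can assume this subset is empty.

We can without loss of generality write $|\varphi\rangle = (I \otimes M)|\Phi_{d^n}\rangle$ for some matrix M. In terms of
M, Eq. (F12) implies that $|\varphi\rangle$ is a ground state of $H_{n,t}$ if and only if M commutes with $U^{\otimes t}$ for all
$U \in \mathbb{U}(d^n)$. It is well-known (see kill , or j6^] for a quantum information perspective) that the set
of such M is precisely given by the span of the $V_{d^n}(\pi)$ for $\pi \in S_t$ (see footnote 2).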

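Item 2. Eq. (F8) follows from

$$\sum_{\pi \in S_t} |\langle\psi_{\sigma,d}|\psi_{\pi,d}\rangle|^n = \frac{1}{d^{tn}} \sum_{\pi \in S_t} \mathrm{tr}\left(V_{d^n}(\sigma)^\dagger V_{d^n}(\pi)\right) = \frac{1}{d^{tn}} \sum_{\pi \in S_t} \mathrm{tr}\left(V_{d^n}(\sigma^{-1}\pi)\right) = \frac{1}{d^{tn}} \sum_{\pi \in S_t} \mathrm{tr}\left(V_{d^n}(\pi)\right) = \frac{t!}{d^{tn}}\, \mathrm{tr}(P_{sym,t,d^n}), \tag{F13}$$

with $P_{sym,t,d^n}$ the projector onto the symmetric subspace of $(\mathbb{C}^{d^n})^{\otimes t}$. The first equality follows from
the definition of $|\psi_{\pi,d}\rangle$ and the relation $V_{d^n}(\pi) = (V_d(\pi))^{\otimes n}$, the second and third from the fact
that $S_t$ is a group and $V_d(\pi)$ a representation of $\pi$, and the last from the relation

$$P_{sym,t,d^n} = \frac{1}{t!} \sum_{\pi \in S_t} V_{d^n}(\pi). \tag{F14}$$

Using $\mathrm{tr}(P_{sym,t,d^n}) = (d^n + t - 1) \ldots (d^n + 1)\, d^n / t!$, Eq. (F13), and our assumption that $t^2 < d^n$, we
obtain

$$\sum_{\pi \in S_t} |\langle\psi_{\sigma,d}|\psi_{\pi,d}\rangle|^n = \frac{(d^n + t - 1) \ldots (d^n + 1)\, d^n}{d^{tn}} \le 1 + \frac{t^2}{d^n}. \tag{F15}$$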

2 Note that the form of the eigenspace of $H_{n,t}$ follows directly from the fact that random circuits drawn from a
universal set of gates converge to the Haar measure, a fact that was first proven in [70]. A more direct proof of convergence
can also be obtained by applying general sufficient conditions given by Theorem 3.3 of [71] for Markov chains to
converge to a unique invariant measure [73].
To prove Eq. (F9), let $B := \sum_{\pi \in S_t} |\pi\rangle\langle\psi_{\pi,d}|^{\otimes n}$ with $\{|\pi\rangle\}_{\pi \in S_t}$ an orthonormal set of vectors.
We have

$$\left\| BB^\dagger - \sum_{\pi \in S_t} |\pi\rangle\langle\pi| \right\|_\infty \le \frac{t^2}{d^n}, \tag{F16}$$

where we used Eq. (F15) and the fact that the operator norm of a matrix is always smaller than
the column norm. Since $BB^\dagger$ has the same eigenvalues as $B^\dagger B$ and

$$A_{n,t} := \sum_{\pi \in S_t} (|\psi_{\pi,d}\rangle\langle\psi_{\pi,d}|)^{\otimes n} = B^\dagger B, \tag{F17}$$

we find $(1 - \frac{t^2}{d^n})\, G_{n,t} \le A_{n,t} \le (1 + \frac{t^2}{d^n})\, G_{n,t}$, where we used that $G_{n,t}$ is the projector onto the
support of $A_{n,t}$. Thus

$$\|A_{n,t} - G_{n,t}\|_\infty \le \frac{t^2}{d^n}, \tag{F18}$$

which is Eq. (F9). □

3. Proof of Lemma 17

We start by defining the necessary notation to state the result of [52] which we employ. We
consider a chain of systems with local finite dimensional Hilbert space $\mathcal{H}$ labeled by natural numbers
(excluding 0). We consider a family of Hamiltonians

$$H_{[m,n]} = \sum_{i=m}^{n-1} h_{i,i+1} \tag{F19}$$

acting on $\mathcal{H}^{\otimes(n-m)}$, where $h_{i,i+1}$ are the nearest neighbor interaction terms, which are assumed to
be projectors. In words, $H_{[m,n]}$ includes all the interaction terms for which both systems belong
to the interval [m, n]. We also let the chain be translationally invariant, i.e. $h_{i,i+1}$ are the same for
all i. We assume further that the minimum eigenvalue of $H_{[m,n]}$ is zero for all m, n and denote by
$\mathcal{G}_{[m,n]}$ the ground space of $H_{[m,n]}$, namely

$$\mathcal{G}_{[m,n]} = \left\{ |\psi\rangle \in \mathcal{H}^{\otimes(n-m)} : H_{[m,n]}|\psi\rangle = 0 \right\}. \tag{F20}$$

Finally let $G_{[m,n]}$ be the projector onto $\mathcal{G}_{[m,n]}$.

Lemma 27 (Nachtergaele, Theorem 3 of [52]). Suppose there exist positive integers l and $n_l$, and a real
number $\epsilon_l < 1/\sqrt{l}$ such that for all $n_l \le m \le N - 1$,

$$\left\| I_{A_1} \otimes G_{A_2 B} \left( G_{A_1 A_2} \otimes I_B - G_{A_1 A_2 B} \right) \right\|_\infty \le \epsilon_l \tag{F21}$$

with $A_1 := [1, m - l - 1]$, $A_2 := [m - l, m - 1]$, $B := m$. Then

A(H M ) > A(H M ) r 1 -^ 2 ) • (F22) 
We can now prove Lemma III.2, restated below:

Lemma III.2 For all integers n, t with $n \ge \lceil 2t \log(t) \rceil$

$$\Delta(H_{n,t}) \ge \frac{\Delta\left(H_{\lceil 2 \log(d)^{-1} \log(t) \rceil, t}\right)}{8 \log(d)^{-1} \log(t)}. \tag{F23}$$

Proof. We apply Lemma 27 with $n_l = 2l$ and $\epsilon_l = 1/(2\sqrt{l})$. Then we must show that for all m in
the range $2l \le m \le n$,

$$\left\| I_{A_1} \otimes G_{A_2 B} \left( G_{A_1 A_2} \otimes I_B - G_{A_1 A_2 B} \right) \right\|_\infty \le \frac{1}{2\sqrt{l}}, \tag{F24}$$

with $A_1 = [1, m - l - 1]$, $A_2 = [m - l, m - 1]$ and $B = m$. Let

$$X_k := \sum_{\pi \in S_t} (|\psi_{\pi,d}\rangle\langle\psi_{\pi,d}|)^{\otimes k}. \tag{F25}$$

By Eq. (F9) of Lemma 16 we have

$$\left\| G_{[1,\ldots,k]} - X_k \right\|_\infty \le \frac{t^2}{d^k}.$$

Then

M



\Ia 1 <8> Ga 2 b (Ga 1 a 2 ®Ib~ Ga^b) ||oo 

6t 2 



< \\I Al ®Xt(X m -i®I B -X m )\\ 0O + 



E(l^>^Ml)® (m " ,_1) ®^ 



+ 



6f 
d* 



with 



(F26) 



(F27) 



In the remainder of the proof we show 



7r65t 



(F28) 



< 1 + max 



(F29) 



Then, 



M < 1 + 



/ t 2 \ 

1 H ; r~r max Hl^lloo + 

V 2d™-'- 1 / 11 n 1100 



6*2 
d* 



< 1 + 



2t l 



1m— I- 



<T^7V 

2t 2 \ 2t 2 6t 2 10t 2 



6^ 
If 
where the second-to-last inequality follows from (F8) of Lemma 16. Then, choosing $l \ge 2 \log(t) / \log(d)$
we find $M \le (2\sqrt{l})^{-1}$, and we get Eq. (F23) from Lemma 27.

Let us turn to prove Eq. (F29). Consider the linear map

$$B_k := \sum_{\pi \in S_t} |\psi_{\pi,d}\rangle^{\otimes k} \langle\pi|, \tag{F31}$$

with $\{|\pi\rangle\}_{\pi \in S_t}$ an orthonormal set of vectors. Using Eq. (F9) of Lemma 16 we have



l-Bfc-Bfclloo 



X] (IV'-7r = d><V , 7r,d| 



(F32) 



Then 



7TG<S t 



^(^(m-I-ljkX^I^L-I-l))®^ 



< 



7TG<S t 



(m-Z-l)-S( m _;_i) 



^(m-«-l)-B( T m _ Z _i) 



^ |vr)(7r| <g)l^ 
max IIKrll 



and Eq. (F29) follows from the bound given by Eq. (F32).



(F33) 
□ 



Appendix G: Proof of Lemma 18

For two probability distributions $\nu_1$, $\nu_2$, we say (X, Y) is a coupling for $\nu_1$, $\nu_2$ if X and Y are
distributed according to $\nu_1$ and $\nu_2$, respectively. Define the $L^p$ Wasserstein distance between two
probability distributions $\nu_1$ and $\nu_2$ as follows

$$W_p(\nu_1, \nu_2) := \inf\left\{ \mathbb{E}\left[d(X,Y)^p\right]^{1/p} \,:\, (X, Y) \text{ is a pair of random variables coupling } (\nu_1, \nu_2) \right\}. \tag{G1}$$

We note it holds that [56]

$$W(\nu_1, \nu_2) = W_1(\nu_1, \nu_2) \le W_2(\nu_1, \nu_2), \tag{G2}$$

with $W(\nu_1, \nu_2)$ given in Eq. (34).

We now state Oliveira's result (in fact a particular case of Theorem 3 of [56]), which offers a
version of the path coupling method for Markov chains on the unitary group. It shows that a
local contraction, in the $L^2$ Wasserstein distance, can be boosted into a global contraction.

Lemma 28 (Oliveira, Theorem 3 of [56]). Let $\nu$ be a probability measure on $\mathbb{U}(d)$ such that

$$\limsup_{\epsilon \to 0}\ \sup_{U_1, U_2 \in \mathbb{U}(d)} \left\{ \frac{W_2(\nu * \delta_{U_1}, \nu * \delta_{U_2})}{\|U_1 - U_2\|_2} \,:\, \|U_1 - U_2\|_2 \le \epsilon \right\} \le \eta, \tag{G3}$$

with $\delta_U$ a mass-point distribution at $U \in \mathbb{U}(d)$. Then for all probability measures $\nu_1$, $\nu_2$ on $\mathbb{U}(d)$,

$$W_2(\nu * \nu_1, \nu * \nu_2) \le \eta\, W_2(\nu_1, \nu_2). \tag{G4}$$
In the rest of this section we apply Lemma 28 to prove Lemma 18. Before we turn to the proof
of Lemma 18 in earnest, we prove the particular case of the random walk on three sites. Then in
the sequel we will build upon it to get the general case.

Lemma 29. For every integer k > 0, 

W((^ d )* 2k , maar ) <(l- ^^y 2 (G5) 

Proof. We will show 



limsup sup < ■ — 7— : \\Ui — U2W2 S £ > < V .— 



e^O Ui,U 2 €V(d 3 ) I, \\U1-U2W2 J" V 2d 2 + 2 

(G6) 

Then applying Lemma 28 repeatedly we find

W 2 ((fi 3 , d )* 2k , /UHaar) = W 2 ((H3,d)* 2k * 6 h (^dT^ * MHaar) 

j /^Haar) 

< r] k V2d 3 / 2 , (G7) 

where in the last inequality we used that $W_2(\nu, \mu_{Haar}) \le \max_{U_1, U_2} \|U_1 - U_2\|_2 \le \sqrt{2}\, d^{3/2}$. The
statement of the lemma thus follows from Eqs. (G7) and (G2).

Let us turn to prove Eq. (G6). Let $R_1$ and $R_2$ be two unitaries acting on three d-dimensional
systems. Consider two steps of the walk. Then we have four possibilities, each occurring with
probability 1/4,

$$R_1 \to \left\{ U_{12}\tilde U_{12} R_1,\ U_{23}\tilde U_{12} R_1,\ U_{12}\tilde U_{23} R_1,\ U_{23}\tilde U_{23} R_1 \right\}, \tag{G8}$$

for independent Haar distributed unitaries $U_{12}$, $U_{23}$, $\tilde U_{12}$, $\tilde U_{23}$, and likewise for $R_2$. Here the indices
of the unitaries label in which subsystem they act non-trivially.
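The two-step walk on three sites can be simulated directly. The sketch below samples Haar unitaries with the standard QR-based construction and applies one step of the walk, acting on the pair (1,2) or (2,3) with equal probability, so that two consecutive steps realize the four equally likely possibilities listed in the text. The helper names `haar_unitary` and `walk_step` are illustrative and do not appear in the paper.

```python
import numpy as np

def haar_unitary(dim: int, rng: np.random.Generator) -> np.ndarray:
    """Haar-random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = (rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases                          # fix the phase ambiguity of the QR factors

def walk_step(R: np.ndarray, d: int, rng: np.random.Generator) -> np.ndarray:
    """One step of the three-site walk: a Haar gate on sites (1,2) or (2,3), each with probability 1/2."""
    gate = haar_unitary(d * d, rng)
    if rng.integers(2) == 0:
        full = np.kron(gate, np.eye(d))        # gate on sites 1,2; identity on site 3
    else:
        full = np.kron(np.eye(d), gate)        # identity on site 1; gate on sites 2,3
    return full @ R
```

Running `walk_step` twice on the same starting unitary with shared randomness for two walkers reproduces the trivial coupling; the nontrivial coupling of the proof would additionally insert the correction unitaries $V_{12}$, $V_{23}$, which this sketch does not attempt.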

At the moment we have a trivial coupling, i.e. $R_1$ and $R_2$ are subjected to the same transformation.
Now we introduce a nontrivial coupling, which we show on average brings two infinitesimally
close unitaries closer to each other. We consider the transformation:

$$R_1 \to \left\{ U_{12}\tilde U_{12} R_1,\ U_{23} V_{23} \tilde U_{12} R_1,\ U_{12} V_{12} \tilde U_{23} R_1,\ U_{23}\tilde U_{23} R_1 \right\} \tag{G9}$$

where the unitary $V_{23}$ can depend on $\tilde U_{12}$ and $V_{12}$ can depend on $\tilde U_{23}$, and of course both can
depend on $R_1$ and $R_2$. The unitary $R_2$, in turn, undergoes the same transformation as before,
namely

$$R_2 \to \left\{ U_{12}\tilde U_{12} R_2,\ U_{23}\tilde U_{12} R_2,\ U_{12}\tilde U_{23} R_2,\ U_{23}\tilde U_{23} R_2 \right\} \tag{G10}$$

Let us check that the transformations above indeed define a valid coupling. In order to do so
the induced distribution on the two unitaries $R_1$ and $R_2$ must be the same as in the case of a trivial
coupling. This is clearly true for $R_2$. To see that it is also true for $R_1$, we observe that for any fixed
$V_{23}$, $U_{23}V_{23}$ is Haar distributed for a Haar distributed $U_{23}$ (and likewise for $U_{12}V_{12}$).

In the sequel we show

$$\mathbb{E}(\|X - Y\|_2^2) \le \mathbb{E}(\|X_0 - Y_0\|_2^2), \tag{G11}$$

where $X$ and $Y$ are random variables related by the coupling and $X_0$ and $Y_0$ are infinitesimally close points. In our case of interest the left-hand side of the above equation reads

$$\begin{aligned}
\mathbb{E}(\|X - Y\|_2^2) = \frac{1}{4}\Big( & \mathbb{E}\big(\|\tilde U_{12} U_{12} R_1 - \tilde U_{12} U_{12} R_2\|_2^2\big) + \mathbb{E}\big(\|\tilde U_{23} V_{23} U_{12} R_1 - \tilde U_{23} U_{12} R_2\|_2^2\big) \\
& + \mathbb{E}\big(\|\tilde U_{12} V_{12} U_{23} R_1 - \tilde U_{12} U_{23} R_2\|_2^2\big) + \mathbb{E}\big(\|\tilde U_{23} U_{23} R_1 - \tilde U_{23} U_{23} R_2\|_2^2\big)\Big),
\end{aligned} \tag{G12}$$

with the expectation taken over Haar distributed $U_{12}, \tilde U_{12}, U_{23}, \tilde U_{23}$. Using the unitary invariance of the 2-norm we can rewrite this as

$$\mathbb{E}(\|X - Y\|_2^2) = \frac{1}{4}\Big(2\|R_1 - R_2\|_2^2 + \mathbb{E}\big(\|V_{23} U_{12} R_1 - U_{12} R_2\|_2^2\big) + \mathbb{E}\big(\|V_{12} U_{23} R_1 - U_{23} R_2\|_2^2\big)\Big). \tag{G13}$$

Since $V_{12}$ and $V_{23}$ can depend in an arbitrary way on $U_{23}$ and $U_{12}$, respectively, we can take the minimum over $V_{12}$ and $V_{23}$ to get

$$\mathbb{E}(\|X - Y\|_2^2) = \frac{1}{4}\Big(2\|R_1 - R_2\|_2^2 + \mathbb{E}\min_{V_{23}}\|V_{23} U_{12} R_1 - U_{12} R_2\|_2^2 + \mathbb{E}\min_{V_{12}}\|V_{12} U_{23} R_1 - U_{23} R_2\|_2^2\Big). \tag{G14}$$

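For any two unitaries $U_1, U_2$ we have

$$\|U_1 - U_2\|_2^2 = 2\big(\mathrm{tr}(I) - \mathrm{Re}\,\mathrm{tr}(U_1 U_2^\dagger)\big). \tag{G15}$$

Since $R_1$ and $R_2$ are infinitesimally close we can write

$$R := R_1 R_2^\dagger = e^{i\epsilon H} = I + i\epsilon H - \frac{\epsilon^2}{2} H^2 + O(\epsilon^3) \tag{G16}$$

for a Hermitian matrix $H$ with $\|H\|_2 \le 1$. Then applying (G15) we get

$$\|R_1 - R_2\|_2^2 = \epsilon^2\,\mathrm{tr}(H^2) + O(\epsilon^3). \tag{G17}$$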
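
The identities (G15) and (G17) can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy and three qubits ($d = 2$); the helper names `haar_unitary` and `frob_sq` are introduced here for illustration only and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # three qubits (d = 2), an illustrative choice


def haar_unitary(dim, rng):
    """Sample a Haar-distributed unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases  # rescale columns so the distribution is Haar


def frob_sq(m):
    """Squared 2-norm (Frobenius norm) of a matrix."""
    return float(np.sum(np.abs(m) ** 2))


# (G15): ||U1 - U2||_2^2 = 2 (tr I - Re tr(U1 U2^dagger))
u1, u2 = haar_unitary(dim, rng), haar_unitary(dim, rng)
g15_gap = abs(frob_sq(u1 - u2) - 2 * (dim - np.trace(u1 @ u2.conj().T).real))

# (G17): for R1 R2^dagger = exp(i eps H), ||R1 - R2||_2^2 = eps^2 tr(H^2) + O(eps^3)
a = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
h = (a + a.conj().T) / 2
h /= np.linalg.norm(h)  # normalise so that ||H||_2 = 1
evals, evecs = np.linalg.eigh(h)
eps = 1e-3
r = (evecs * np.exp(1j * eps * evals)) @ evecs.conj().T  # R = exp(i eps H)
r2 = haar_unitary(dim, rng)
r1 = r @ r2  # chosen so that R1 R2^dagger = R
g17_gap = abs(frob_sq(r1 - r2) - eps**2 * np.trace(h @ h).real)

print(g15_gap, g17_gap)
```

Both gaps should be negligible: the first is zero up to rounding, and the second is far below $\epsilon^3$.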

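Let us now consider the term $\mathbb{E}\big(\min_{V_{12}}\|V_{12} U_{23} R_1 - U_{23} R_2\|_2^2\big)$ (for the other term the calculation gives the same result). We have

$$\begin{aligned}
\mathbb{E}\Big(\min_{V_{12}}\|V_{12} U_{23} R_1 - U_{23} R_2\|_2^2\Big) &= 2\Big(\mathrm{tr}(I) - \mathbb{E}\big(\max_{V_{12}} |\mathrm{tr}(V_{12} U_{23} R_1 R_2^\dagger U_{23}^\dagger)|\big)\Big) \\
&= 2\Big(\mathrm{tr}(I) - \mathbb{E}\,\|\mathrm{tr}_3(U_{23} R U_{23}^\dagger)\|_1\Big),
\end{aligned} \tag{G18}$$

with $R = R_1 R_2^\dagger$. The last equality follows from the following variational characterization of the trace norm: $\|X\|_1 = \max_{U} |\mathrm{tr}(UX)|$, where the maximum is taken over all unitaries $U$.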
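
The variational characterization of the trace norm used in (G18) can be illustrated numerically. This is a minimal sketch assuming NumPy: the maximizing unitary is read off from the singular value decomposition, and random unitaries (drawn with the illustrative helper `haar_unitary`) never exceed it.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4


def haar_unitary(dim, rng):
    """Sample a Haar-distributed unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases


x = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))

# Trace norm = sum of singular values, with x = w @ diag(s) @ vh.
w, s, vh = np.linalg.svd(x)
trace_norm = float(s.sum())

# The unitary (w @ vh)^dagger attains the maximum: tr(U x) = sum of singular values.
u_opt = (w @ vh).conj().T
attained = abs(np.trace(u_opt @ x))

# Randomly drawn unitaries give values that are at most the trace norm.
best_random = max(abs(np.trace(haar_unitary(dim, rng) @ x)) for _ in range(200))

print(trace_norm, attained, best_random)
```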

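From Eq. (G16) we get

$$\mathrm{tr}_3(U_{23} R U_{23}^\dagger) = d\, I_{12} + i\epsilon\, \mathrm{tr}_3(U_{23} H U_{23}^\dagger) - \frac{\epsilon^2}{2}\, \mathrm{tr}_3(U_{23} H^2 U_{23}^\dagger) + O(\epsilon^3). \tag{G19}$$

An easy calculation shows that for any two Hermitian operators $A$, $B$ we have

$$\Big\|I + i\epsilon A - \frac{\epsilon^2}{2} B\Big\|_1 = \mathrm{tr}(I) + \frac{\epsilon^2}{2}\big(\mathrm{tr}(A^2) - \mathrm{tr}(B)\big) + O(\epsilon^3). \tag{G20}$$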
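
The expansion (G20) can be checked numerically for Hermitian $A$ and $B$. The sketch below assumes NumPy and normalizes both operators to unit operator norm (an illustrative choice that keeps the remainder small); it compares the exact trace norm with the second-order expression.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 4


def random_hermitian(dim, rng):
    """Random Hermitian matrix rescaled to operator norm 1."""
    a = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    h = (a + a.conj().T) / 2
    return h / np.linalg.norm(h, 2)


def trace_norm(m):
    """Trace norm as the sum of singular values."""
    return float(np.linalg.svd(m, compute_uv=False).sum())


a, b = random_hermitian(dim, rng), random_hermitian(dim, rng)


def remainder(eps):
    """Difference between the exact trace norm and the right-hand side of (G20)."""
    exact = trace_norm(np.eye(dim) + 1j * eps * a - 0.5 * eps**2 * b)
    second_order = dim + 0.5 * eps**2 * (np.trace(a @ a).real - np.trace(b).real)
    return abs(exact - second_order)


rem_coarse, rem_fine = remainder(1e-2), remainder(5e-3)
print(rem_coarse, rem_fine)
```

Both remainders should lie well below $\epsilon^3$, consistent with the $O(\epsilon^3)$ error term.
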
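Hence we obtain

$$\|\mathrm{tr}_3(U_{23} R U_{23}^\dagger)\|_1 = \mathrm{tr}(I) + \frac{\epsilon^2}{2d}\, \mathrm{tr}\big((\mathrm{tr}_3(U_{23} H U_{23}^\dagger))^2\big) - \frac{\epsilon^2}{2}\, \mathrm{tr}(H^2) + O(\epsilon^3), \tag{G21}$$

so that by Eq. (G18)

$$\inf_{V_{12}} \mathbb{E}\,\|V_{12} U_{23} R_1 - U_{23} R_2\|_2^2 = \epsilon^2\Big[\mathrm{tr}(H^2) - \frac{1}{d}\, \mathbb{E}\,\mathrm{tr}\big((\mathrm{tr}_3(U_{23} H U_{23}^\dagger))^2\big)\Big] + O(\epsilon^3). \tag{G22}$$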



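Now our goal is to compute the average $\mathbb{E}\,\mathrm{tr}\big((\mathrm{tr}_3(U_{23} H U_{23}^\dagger))^2\big)$. We note that for any operator $C_{123}$, with $C_{12} := \mathrm{tr}_3(C_{123})$, we have

$$\mathrm{tr}(C_{12}^2) = \mathrm{tr}\big((C_{123} \otimes C_{\bar 1 \bar 2 \bar 3})(F_{12:\bar 1 \bar 2} \otimes I_{3\bar 3})\big), \tag{G23}$$

where systems with bars are copies of the original systems, and $F_{12:\bar 1 \bar 2}$ is the operator which swaps systems $12$ with $\bar 1 \bar 2$. Therefore

$$\mathbb{E}\,\mathrm{tr}\big((\mathrm{tr}_3(U_{23} H U_{23}^\dagger))^2\big) = \mathbb{E}\,\mathrm{tr}\Big((H_{123} \otimes H_{\bar 1 \bar 2 \bar 3})(U_{23}^\dagger \otimes U_{\bar 2 \bar 3}^\dagger)(F_{12:\bar 1 \bar 2} \otimes I_{3\bar 3})(U_{23} \otimes U_{\bar 2 \bar 3})\Big). \tag{G24}$$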
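
The swap identity (G23) can be verified directly on a random operator. This is a minimal sketch assuming NumPy and $d = 2$; the tensor ordering $(1, 2, 3, \bar 1, \bar 2, \bar 3)$ and the way the swap operator is built are choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 2

# Random operator C_123 on three d-level systems, row index ordered as (1, 2, 3).
c = rng.standard_normal((d**3, d**3)) + 1j * rng.standard_normal((d**3, d**3))

# Left-hand side of (G23): tr(C_12^2) with C_12 = tr_3(C_123).
c12 = np.einsum('abkcdk->abcd', c.reshape([d] * 6)).reshape(d**2, d**2)
lhs = np.trace(c12 @ c12)

# Right-hand side: F_{12:1'2'} (x) I_{33'} on the ordering (1, 2, 3, 1', 2', 3').
# Permuting the row axes of the identity exchanges slots (1, 2) with (1', 2').
n = d**6
swap = (np.eye(n).reshape([d] * 12)
        .transpose(3, 4, 2, 0, 1, 5, 6, 7, 8, 9, 10, 11)
        .reshape(n, n))
rhs = np.trace(np.kron(c, c) @ swap)

print(abs(lhs - rhs))
```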

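We now compute

$$\mathbb{E}\Big((U_{23}^\dagger \otimes U_{\bar 2 \bar 3}^\dagger)(F_{2:\bar 2} \otimes I_{3\bar 3})(U_{23} \otimes U_{\bar 2 \bar 3})\Big) = \frac{d}{d^2+1}\big(I_{23:\bar 2 \bar 3} + F_{23:\bar 2 \bar 3}\big). \tag{G25}$$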
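
The coefficient $d/(d^2+1)$ in (G25) can be recovered without sampling. The sketch below assumes NumPy and relies on the standard fact that a Haar average of the form $\mathbb{E}[(U \otimes U)^\dagger X (U \otimes U)]$ lies in the span of the identity and the swap, and that the average preserves the traces of $X$ against both. The helper `permutation_operator` and the ordering $(2, 3, \bar 2, \bar 3)$ are illustrative choices.

```python
import numpy as np

d = 2


def permutation_operator(perm, d):
    """Operator on four d-level systems mapping |j0 j1 j2 j3> to the basis
    state whose k-th slot holds j[perm[k]]."""
    n = d**4
    return (np.eye(n).reshape([d] * 8)
            .transpose(tuple(perm) + (4, 5, 6, 7))
            .reshape(n, n))


# System ordering: (2, 3, 2', 3').
x = permutation_operator((2, 1, 0, 3), d)  # F_{2:2'} (x) I_{33'}
f = permutation_operator((2, 3, 0, 1), d)  # F_{23:2'3'}
ident = np.eye(d**4)

# Write the average as a*I + b*F and match traces against I and F.
gram = np.array([[np.trace(ident), np.trace(f)],
                 [np.trace(f), np.trace(f @ f)]])
rhs = np.array([np.trace(x), np.trace(x @ f)])
a, b = np.linalg.solve(gram, rhs)

print(a, b, d / (d**2 + 1))
```

For $d = 2$ both coefficients equal $2/5$, in agreement with (G25).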



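Using the fact that the tensor product of swap operators is again a swap operator (e.g. $F_{12:\bar 1 \bar 2} = F_{1:\bar 1} \otimes F_{2:\bar 2}$), we obtain

$$\begin{aligned}
\mathbb{E}\,\mathrm{tr}\big((\mathrm{tr}_3(U_{23} H U_{23}^\dagger))^2\big) &= \frac{d}{d^2+1}\Big(\mathrm{tr}\big((H_{123} \otimes H_{\bar 1 \bar 2 \bar 3})\, F_{123:\bar 1 \bar 2 \bar 3}\big) + \mathrm{tr}\big((H_{123} \otimes H_{\bar 1 \bar 2 \bar 3})(F_{1:\bar 1} \otimes I_{23:\bar 2 \bar 3})\big)\Big) \\
&= \frac{d}{d^2+1}\big(\mathrm{tr}(H^2) + \mathrm{tr}(H_1^2)\big) \\
&\ge \frac{d}{d^2+1}\, \mathrm{tr}(H^2),
\end{aligned} \tag{G26}$$

where $H_1 := \mathrm{tr}_{23}(H)$. Inserting this into (G22) gives

$$\inf_{V_{12}} \mathbb{E}\,\|V_{12} U_{23} R_1 - U_{23} R_2\|_2^2 \le \epsilon^2\Big(1 - \frac{1}{d^2+1}\Big)\mathrm{tr}(H^2) + O(\epsilon^3). \tag{G27}$$

Finally, using Eq. (G14) together with (G17),

$$\mathbb{E}(\|X - Y\|_2^2) \le \Big(1 - \frac{1}{2(d^2+1)}\Big)\epsilon^2\, \mathrm{tr}(H^2) + O(\epsilon^3), \tag{G28}$$

and we are done. □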
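
The arithmetic behind the last step can be made explicit. In (G14), two of the four branches contribute $\epsilon^2\,\mathrm{tr}(H^2)$ unchanged and the two coupled branches contribute the factor from (G27). The following sketch uses exact rational arithmetic from the Python standard library; the function name `two_step_factor` is an illustrative choice.

```python
from fractions import Fraction


def two_step_factor(d):
    """Average over the four branches of (G14): two trivially coupled
    branches (factor 1) and two branches with the factor from (G27)."""
    coupled = 1 - Fraction(1, d * d + 1)
    return Fraction(1, 4) * (2 * 1 + 2 * coupled)


factors = {d: two_step_factor(d) for d in (2, 3, 4)}
for d, value in factors.items():
    print(d, value, 1 - Fraction(1, 2 * (d * d + 1)))
```

For qubits ($d = 2$) the two-step contraction factor is $9/10$.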



Remark (Why one step of the walk does not work): It is instructive to see why coupling only one step of the walk does not seem to be enough to prove contraction. In this case, a general class of couplings is given by

$$\begin{aligned}
R_1 &\to \{U_{12} V_{12} R_1,\ U_{23} V_{23} R_1\}, \\
R_2 &\to \{U_{12} R_2,\ U_{23} R_2\},
\end{aligned} \tag{G29}$$

where $V_{12}$ and $V_{23}$ can depend only on $R_1$ and $R_2$. If we optimize over the choice of $V_{12}, V_{23}$ we get

$$\mathbb{E}(\|X - Y\|_2^2) = 2\,\mathrm{tr}(I) - \mathbb{E}(\|\mathrm{tr}_3(R)\|_1) - \mathbb{E}(\|\mathrm{tr}_1(R)\|_1) = \epsilon^2\Big(\mathrm{tr}(H^2) - \frac{1}{2d}\big(\mathrm{tr}(H_{12}^2) + \mathrm{tr}(H_{23}^2)\big)\Big) + O(\epsilon^3), \tag{G30}$$

where $H_{12} = \mathrm{tr}_3(H)$ and $H_{23} = \mathrm{tr}_1(H)$. However, there exist Hermitian matrices $H$ such that $H_{12} = H_{23} = 0$, in which case $\mathbb{E}\,\|X - Y\|_2^2 = \epsilon^2\,\mathrm{tr}(H^2) + O(\epsilon^3) = \|X_0 - Y_0\|_2^2 + O(\epsilon^3)$, so that we do not have any contraction. We can thus understand the role of the second step of the walk in constructing a useful coupling: it is to randomly change such bad cases of $H$ into good $H$, with non-zero probability. □
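
A concrete instance of such a bad $H$ is easy to write down for qubits. The sketch below assumes NumPy and uses $H \propto Z \otimes Z \otimes Z$, an example chosen here for illustration (it is not taken from the text); both reduced operators vanish, so the $\epsilon^2$ coefficient in (G30) equals $\mathrm{tr}(H^2)$.

```python
import numpy as np

d = 2
z = np.diag([1.0, -1.0])
h = np.kron(np.kron(z, z), z) / np.sqrt(8)  # Hermitian with ||H||_2 = 1
t = h.reshape([d] * 6)  # axes: rows (1, 2, 3), columns (1, 2, 3)

h12 = np.einsum('abkcdk->abcd', t).reshape(d**2, d**2)  # tr_3(H)
h23 = np.einsum('kabkcd->abcd', t).reshape(d**2, d**2)  # tr_1(H)

# Coefficient of eps^2 in (G30); it equals tr(H^2) when H_12 = H_23 = 0.
eps_sq_coeff = np.trace(h @ h) - (np.trace(h12 @ h12) + np.trace(h23 @ h23)) / (2 * d)

print(np.abs(h12).max(), np.abs(h23).max(), eps_sq_coeff)
```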

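We are now in position to prove Lemma 18.

Lemma 18. For all integers $k, n > 0$,

$$W\big((\mu_{n,d})^{*(n-1)k}, \mu_{\mathrm{Haar}}\big) \le \Big(1 - \frac{1}{e^n (d^2+1)^{n-2}}\Big)^{\frac{k}{n-1}} \sqrt{2}\, d^{n/2}. \tag{G31}$$

Proof. We will show that

$$\limsup_{\epsilon \to 0}\ \sup_{U_1, U_2 \in \mathbb{U}(d^n)} \Big\{ \frac{W_2\big((\mu_{n,d})^{*(n-1)} * \delta_{U_1},\ (\mu_{n,d})^{*(n-1)} * \delta_{U_2}\big)}{\|U_1 - U_2\|_2} : \|U_1 - U_2\|_2 \le \epsilon \Big\} \le \eta, \tag{G32}$$

with

$$\eta := \Big(1 - \frac{1}{e^n (d^2+1)^{n-2}}\Big)^{\frac{1}{n-1}}. \tag{G33}$$

Then, in analogy to the proof of Lemma 29,

$$\begin{aligned}
W_2\big((\mu_{n,d})^{*(n-1)k}, \mu_{\mathrm{Haar}}\big) &= W_2\big((\mu_{n,d})^{*(n-1)k} * \delta_I,\ (\mu_{n,d})^{*(n-1)k} * \mu_{\mathrm{Haar}}\big) \\
&\le \eta^k\, W_2(\delta_I, \mu_{\mathrm{Haar}}) \\
&\le \eta^k \sqrt{2}\, d^{n/2},
\end{aligned} \tag{G34}$$

and the statement of the lemma follows from the bound $W\big((\mu_{n,d})^{*(n-1)k}, \mu_{\mathrm{Haar}}\big) \le W_2\big((\mu_{n,d})^{*(n-1)k}, \mu_{\mathrm{Haar}}\big)$.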

Let us now prove Eq. (G32). In order to avoid the problem that occurred when we applied a single step of the walk to three systems, we now need to apply $k = n - 1$ steps of the walk. There are then $k^k$ possible paths, and we make a nontrivial coupling only for $k!$ of them, namely for those paths in which no pair of systems is repeated, i.e. for the case $U_{n-1,n} \cdots U_{23} U_{12}$ and all its permutations (all sequences which come from permuting the order of the unitaries in the sequence above). For those $k!$ paths we consider the following coupling:

$$\begin{aligned}
R_1 &\to U_{i_{n-1}, i_{n-1}+1}\, V_{i_{n-1}, i_{n-1}+1} \cdots U_{i_2, i_2+1}\, U_{i_1, i_1+1}\, R_1, \\
R_2 &\to U_{i_{n-1}, i_{n-1}+1} \cdots U_{i_2, i_2+1}\, U_{i_1, i_1+1}\, R_2,
\end{aligned} \tag{G35}$$

where $V$ can depend on all unitaries sitting to its right, and $i_j \in \{1, \ldots, n-1\}$. We now consider explicitly a particular sequence $U_{12} U_{23} \cdots U_{n-1,n}$ and compute the analogue of (G18) (for the other sequences the calculations give the same result). We have

$$\begin{aligned}
& \inf_{V_{12}} \mathbb{E}\big(\|U_{12} V_{12} U_{23} \cdots U_{n-1,n} R_1 - U_{12} U_{23} \cdots U_{n-1,n} R_2\|_2^2\big) \\
& \qquad = 2\Big(\mathrm{tr}(I) - \mathbb{E}\big(\|\mathrm{tr}_{3 \ldots n}(U_{23} \cdots U_{n-1,n}\, R\, U_{n-1,n}^\dagger \cdots U_{23}^\dagger)\|_1\big)\Big).
\end{aligned} \tag{G36}$$
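Expanding in $\epsilon$ we get, in analogy to (G22),

$$\begin{aligned}
& \inf_{V_{12}} \mathbb{E}\,\|U_{12} V_{12} U_{23} \cdots U_{n-1,n} R_1 - U_{12} U_{23} \cdots U_{n-1,n} R_2\|_2^2 \\
& \qquad \le \epsilon^2\Big[\mathrm{tr}(H^2) - \frac{1}{d^{n-2}}\, \mathbb{E}\,\mathrm{tr}\big((\mathrm{tr}_{3 \ldots n}(U_{23} \cdots U_{n-1,n}\, H\, U_{n-1,n}^\dagger \cdots U_{23}^\dagger))^2\big)\Big] + O(\epsilon^3).
\end{aligned} \tag{G37}$$

Moreover, using (G26) repeatedly we obtain

$$\mathbb{E}\,\mathrm{tr}\big((\mathrm{tr}_{3 \ldots n}(U_{23} \cdots U_{n-1,n}\, H\, U_{n-1,n}^\dagger \cdots U_{23}^\dagger))^2\big) \ge \Big(\frac{d}{d^2+1}\Big)^{n-2} \mathrm{tr}(H^2), \tag{G38}$$

so that

$$\inf_{V_{12}} \mathbb{E}\big(\|U_{12} V_{12} U_{23} \cdots U_{n-1,n} R_1 - U_{12} U_{23} \cdots U_{n-1,n} R_2\|_2^2\big) \le \epsilon^2\Big(1 - \frac{1}{(d^2+1)^{n-2}}\Big)\mathrm{tr}(H^2) + O(\epsilon^3). \tag{G39}$$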



Finally, we have $k^k - k!$ paths of the walk for which we do not have any shrinking (as our coupling was trivial for those paths) and $k!$ paths in which we have a shrinking factor of $1 - (d^2+1)^{-(n-2)}$. Thus this gives

$$\mathbb{E}(\|X - Y\|_2^2) \le \chi\, \|X_0 - Y_0\|_2^2 + O(\epsilon^3), \tag{G40}$$

with

$$\chi := 1 - \frac{(n-1)!}{(n-1)^{n-1}}\, \frac{1}{(d^2+1)^{n-2}} \le 1 - \frac{1}{e^n (d^2+1)^{n-2}}, \tag{G41}$$

where we used the bound $n! \ge n^n e^{-n}$. □
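
The path counting and the factorial bound in (G41) can be checked with exact arithmetic. This is a minimal sketch using only the Python standard library; the function names `chi` and `chi_upper` are illustrative, and the ranges of $n$ and $d$ are arbitrary small test values.

```python
import math
from fractions import Fraction


def chi(n, d):
    """Left-hand side of (G41): a fraction k!/k^k of the k^k paths (k = n - 1)
    shrinks by 1 - (d^2 + 1)^(-(n - 2)); the remaining paths do not shrink."""
    k = n - 1
    coupled_fraction = Fraction(math.factorial(k), k**k)
    return 1 - coupled_fraction * Fraction(1, (d * d + 1) ** (n - 2))


def chi_upper(n, d):
    """Right-hand side of (G41)."""
    return 1 - 1 / (math.e**n * (d * d + 1) ** (n - 2))


checks = [(n, d, float(chi(n, d)), chi_upper(n, d))
          for n in range(3, 9) for d in (2, 3)]
for n, d, exact, upper in checks:
    print(n, d, exact, upper)
```

For $n = 3$ and $d = 2$ the exact factor is $9/10$, while the simplified bound gives roughly $0.990$, which shows how much is given up by the factorial estimate.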



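Appendix H: Proof of Lemma 19

In this section we prove the last lemma needed in the proof of Theorem 5.

Lemma 19 (restatement). For every $t, d \ge 1$ and every measure $\nu$ on $\mathbb{U}(d)$,

$$g(\nu, t) \le 2t\, W(\nu, \mu_{\mathrm{Haar}}).$$

Proof. The definition of $g(\cdot, \cdot)$ states that

$$g(\nu, t) = \Big\|\int_{\mathbb{U}(d^n)} U^{\otimes t,t}\, \nu(dU) - \int_{\mathbb{U}(d^n)} U^{\otimes t,t}\, \mu_{\mathrm{Haar}}(dU)\Big\|_\infty. \tag{H1}$$

Let $X$ be such that $\|X\|_1 \le 1$ and

\begin{align}
& \mathrm{tr}\Big[\Big(\int \nu(dU)\, U^{\otimes t,t} - \int \mu_{\mathrm{Haar}}(dU)\, U^{\otimes t,t}\Big) X\Big] \tag{H2} \\
& \qquad = \Big\|\int \nu(dU)\, U^{\otimes t,t} - \int \mu_{\mathrm{Haar}}(dU)\, U^{\otimes t,t}\Big\|_\infty. \tag{H3}
\end{align}

That such an $X$ always exists follows from the following variational characterization of the operator norm: $\|A\|_\infty = \max_X \{|\mathrm{tr}(AX)| : \|X\|_1 \le 1\}$.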
Define $f(U) := \mathrm{tr}(U^{\otimes t,t} X)$. We claim $f$ is $2t$-Lipschitz. Before proving it, let us show how it implies the statement of the lemma. Indeed, since $f/(2t)$ is 1-Lipschitz,

$$g(\nu, t) = 2t\, \Big|\int \frac{f(U)}{2t}\, \nu(dU) - \int \frac{f(U)}{2t}\, \mu_{\mathrm{Haar}}(dU)\Big| \le 2t\, W(\nu, \mu_{\mathrm{Haar}}), \tag{H4}$$

where the last inequality follows from the definition of the Wasserstein distance.

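It remains to show that $f$ is $2t$-Lipschitz. This follows from

$$\begin{aligned}
|f(U) - f(V)| &= |\mathrm{tr}((U^{\otimes t,t} - V^{\otimes t,t}) X)| \\
&\le \|X\|_1\, \|U^{\otimes t,t} - V^{\otimes t,t}\|_\infty \\
&\le \|U^{\otimes t,t} - V^{\otimes t,t}\|_\infty \\
&\le 2t\, \|U - V\|_\infty \\
&\le 2t\, \|U - V\|_2.
\end{aligned} \tag{H5}$$

The first inequality follows from the relation $|\mathrm{tr}(A^\dagger B)| \le \|A\|_1 \|B\|_\infty$, the second from the bound $\|X\|_1 \le 1$, the third from the hybrid argument, and the fourth from the fact that the operator norm is at most the 2-norm. The hybrid argument consists of repeatedly applying the inequality

$$\|A \otimes B - C \otimes D\|_\infty \le \|A - C\|_\infty + \|B - D\|_\infty, \tag{H6}$$

valid for unitaries $A$, $B$, $C$ and $D$. This, in turn, follows from

$$\begin{aligned}
\|A \otimes B - C \otimes D\|_\infty &= \|A \otimes (B - D) + (A - C) \otimes D\|_\infty \\
&\le \|A \otimes (B - D)\|_\infty + \|(A - C) \otimes D\|_\infty \\
&= \|A\|_\infty \|B - D\|_\infty + \|D\|_\infty \|A - C\|_\infty \\
&\le \|B - D\|_\infty + \|A - C\|_\infty.
\end{aligned} \tag{H7}$$

□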
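
The hybrid inequality (H6) and the resulting Lipschitz bound in (H5) can be spot-checked numerically. The sketch below assumes NumPy and takes $U^{\otimes t,t} = U^{\otimes t} \otimes \bar U^{\otimes t}$ (with $\bar U$ the complex conjugate), which is the convention assumed here; the helper names are illustrative.

```python
from functools import reduce

import numpy as np

rng = np.random.default_rng(4)
d, t = 2, 2


def haar_unitary(dim, rng):
    """Sample a Haar-distributed unitary via QR of a complex Gaussian matrix."""
    z = (rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases


def op_norm(m):
    """Operator (spectral) norm."""
    return float(np.linalg.norm(m, 2))


def tensor_tt(u, t):
    """Assumed convention: t copies of u tensored with t copies of conj(u)."""
    return reduce(np.kron, [u] * t + [u.conj()] * t)


# Single hybrid step (H6) for four random unitaries.
a, b, c, e = (haar_unitary(d, rng) for _ in range(4))
h6_lhs = op_norm(np.kron(a, b) - np.kron(c, e))
h6_rhs = op_norm(a - c) + op_norm(b - e)

# Third and fourth inequalities of (H5).
u, v = haar_unitary(d, rng), haar_unitary(d, rng)
lhs = op_norm(tensor_tt(u, t) - tensor_tt(v, t))
lipschitz_bound = 2 * t * op_norm(u - v)
frobenius_bound = 2 * t * float(np.linalg.norm(u - v))  # 2-norm (Frobenius)

print(h6_lhs, h6_rhs, lhs, lipschitz_bound, frobenius_bound)
```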